REGULATION OF GENE EXPRESSION BY PROTEIN METHYLATION

RELATED APPLICATION DATA

This application claims priority to provisional application Serial No. 60/112,523,
filed December 15, 1998, the entire disclosure of which is herein incorporated by reference.

GOVERNMENT SUPPORT 

The government may have certain rights in this invention pursuant to grants 
DK43093 and NS17269 from the National Institutes of Health.

FIELD OF THE INVENTION 

The invention relates to coactivators of transcription and to proteins with protein
methyltransferase activity.

BACKGROUND 

The activities of all cells are conducted primarily by the thousands of different
types of proteins each cell produces. The blueprint or code for synthesizing each protein is found
in a corresponding gene, i.e., each gene encodes the information needed to synthesize a specific
protein. Gene "expression" results in the production of the protein by a stepwise mechanism that
includes 1) "transcription" of the gene by RNA polymerase to produce a messenger RNA
(mRNA) that contains the same protein-encoding information; and 2) "translation" of the mRNA
by ribosomes to produce the protein. Each gene is expressed in specific tissues and at specific
times during the life of the organism. Expression of most genes is regulated in response to a
variety of signals that arise either outside or inside the organism. This pattern of specific
expression for each gene is determined by the "promoter region" of each gene, which is located
adjacent to the protein-encoding region of the gene. Each gene's promoter contains many
"regulatory elements." Each regulatory element serves as a binding site for a specific protein,
and the binding of the appropriate protein to a specific regulatory element can cause
enhancement or repression of gene expression. Together, the regulatory elements and the
proteins that bind to these elements determine the expression pattern for the specific gene.

Hormones represent one of the most important mechanisms for communication
between different organs and tissues in multicellular organisms. In mammals, hormones are
synthesized in one organ or tissue, and travel through the blood stream to various target organs.
By interacting with specific receptor proteins in the target cells, the hormones change the
activities of the cell. Frequently the cellular effects of the hormone include changes in the
expression of specific genes. The protein products of these genes then carry out the biological
actions that result in altered cellular functions.

The effects of one extremely important class of hormones are carried out by a
family of related receptor proteins called the nuclear receptors (Evans, R.M. (1988) Science
240:889-895; Tsai, M-J. and B.W. O'Malley (1994) Annu. Rev. Biochem. 63:451-486; Beato,
M., et al. (1995) Cell 83:851-857). This family of proteins includes the receptors for all of the
steroid hormones, thyroid hormones, vitamin D, and vitamin A, among others. The family also
includes a large number of proteins called "orphan receptors" because they do not bind any
hormone or because the hormone that binds to them is unknown; but they are nevertheless
structurally and functionally related to the hormone-binding nuclear receptors. Nuclear receptors
are transcriptional regulatory proteins that act by a common mechanism. For those nuclear
receptors that do bind hormones, the appropriate hormone must enter the cell and bind to the
nuclear receptors, which are located inside the target cells. The activated nuclear receptors bind
to specific regulatory elements associated with specific genes that are regulated by these proteins.
Binding of the activated nuclear receptors to the regulatory elements helps to recruit RNA
polymerase to the promoter of the gene and thereby activates expression of the gene. This
mechanism also applies to many of the orphan nuclear receptors.

After nuclear receptors bind to a specific regulatory element in the promoter of the
gene, they recruit RNA polymerase to the promoter by a mechanism which involves another
group of proteins called coactivators, which are recruited to the promoter by the nuclear receptors
(Horwitz, K.B. et al. (1996) Mol. Endocrinol. 10:1167-1177; Glass, C.K. et al. (1997) Curr.
Opin. Cell Biol. 9:222-232). The complex of coactivators helps the receptors to activate gene
expression by two different mechanisms. 1) The coactivators make the gene more accessible to RNA
polymerase by unfolding the "chromatin." Chromatin is composed of the DNA (which contains
all the genes) and a large group of DNA-packaging proteins. To unfold chromatin, some of the
coactivator proteins contain an enzymatic activity known as a histone acetyltransferase (HAT).
HAT proteins transfer an acetyl group from acetyl-CoA to the major chromatin proteins, which
are called "histones." Acetylation of the histones helps to unfold chromatin, thus making the
gene and its promoter more accessible to RNA polymerase. 2) The coactivators and the nuclear
receptors make direct contact with a complex of proteins called basal transcription factors that
are associated with RNA polymerase; this interaction recruits RNA polymerase to the promoter.
Once RNA polymerase binds to the promoter, it initiates transcription, i.e., synthesis of mRNA
molecules. The final activation of RNA polymerase after it binds to the promoter may also
require some intervention by the coactivator proteins, but little is known about the mechanism of
these final steps of transcriptional activation.

One specific family of three related coactivator proteins, the "nuclear receptor
coactivators" or "p160 coactivators" (because their mass is approximately 160 kilodaltons), is
required for the gene activation activities of many of the nuclear receptor proteins. The three
related nuclear receptor coactivators are GRIP1, SRC-1, and p/CIP; all three proteins also have
additional names that are used by some investigators (Onate, S.A. et al. (1995) Science
270:1354-1357; Hong, H. et al. (1996) Proc. Natl. Acad. Sci. USA 93:4948-4952; Voegel, J.J. et
al. (1996) EMBO J. 15:3667-3675; Kamei, Y. et al. (1996) Cell 85:403-414; Torchia, J. et al.
(1997) Nature 387:677-684; Hong, H. et al. (1997) Mol. Cell. Biol. 17:2735-2744; Chen, H. et
al. (1997) Cell 90:569-580; Anzick, S.L. et al. (1997) Science 277:965-968; Li, H. et al. (1997)
Proc. Natl. Acad. Sci. USA 94:8479-8484; Takeshita, A. et al. (1997) J. Biol. Chem.
272:27629-27634). These coactivators are recruited directly by the DNA-bound nuclear receptors. The
nuclear receptor coactivators, in turn, recruit other coactivators, including CBP (or p300) and
p/CAF (Chen, H. et al. 1997). All of these coactivators have been shown to play roles in gene
activation by one or both of the two mechanisms mentioned above. Some of them have HAT
activities to help unfold chromatin structure (Chen, H. et al. 1997; Spencer, T.E. et al. (1997)
Nature 389:194-198), and others have been shown to make direct contact with proteins in the
RNA polymerase complex (Chen, H. et al. 1997; Swope, D.L. et al. (1996) J. Biol. Chem.
271:28138-28145). Thus, the discovery and characterization of these coactivators provide a
better understanding of the mechanism by which nuclear receptors activate gene transcription.

Histones are known to be methylated as well as acetylated (Annunziato, A.T. et
al. (1995) Biochem. 34:2916; Gary, J.D. and Clarke, S. (1998) Prog. Nucleic Acids Res. Mol.
Biol. 61:65). However, the function of histone methylation is unknown. Methylation of histone
H3 is a dynamic process during the lifetime of histone molecules, and newly methylated H3 is
preferentially associated with chromatin containing acetylated H4 (Annunziato, A.T. et al. 1995);
thus methylation of H3, like acetylation of H4, is associated with active chromatin. In other
studies lysine methylation of histones has been found in a variety of organisms; arginine
methylation of histones, while not clearly documented in mammals, has been demonstrated in
other classes of organisms (Gary and Clarke 1998). In Drosophila cells heat shock treatment
causes increased arginine methylation of histone H3, which could be associated with activation
of heat shock genes or repression of the other genes (Desrosiers, R. and R.M. Tanguay (1988) J.
Biol. Chem. 263:4686).

Proteins can be N-methylated on amino groups of lysines and guanidino groups of
arginines or carboxymethylated on aspartate, glutamate, or the protein C-terminus. Recent
studies have provided indirect evidence suggesting roles for methylation in a variety of cellular
processes such as RNA processing, receptor mediated signaling, and cellular differentiation
(Aletta, J.M. et al. (1998) Trends Biochem. Sci. 23:89; Gary and Clarke 1998). However, for the
most part the specific methyltransferases, protein substrates, and specific roles played by
methylation in these phenomena have not been identified. Two types of arginine-specific protein
methyltransferase activities have been observed, type I and type II. Genes for three mammalian
and one yeast type I enzymes, which produce monomethyl and asymmetric dimethylarginine
residues, previously have been identified (Figure 1). On the other hand, type II protein arginine
methyltransferases produce monomethyl and symmetric dimethylarginine residues. In vitro
protein substrates for various protein arginine methyltransferases include histones and proteins
involved in RNA metabolism such as hnRNP A1, fibrillarin, and nucleolin (Lin, W-J. et al.
(1996) J. Biol. Chem. 271:15034-15044; Gary, J.D. et al. (1996) J. Biol. Chem. 271:4585;
Najbauer, J. et al. (1993) J. Biol. Chem. 268:10501-10509). The arginine residues methylated in
many of these proteins are found in glycine-rich sequences, and synthetic peptides mimicking
these sequences are good substrates for the same methyltransferases (Najbauer, J. et al. 1993).

SUMMARY 

The invention relates to a transcriptional coactivator, Coactivator Associated
arginine (R) Methyltransferase (CARM1).

One aspect of the invention includes CARM1 cDNA polynucleotides such as
(SEQ ID NO: 1). Polynucleotides include those with sequences substantially equivalent to SEQ
ID NO: 1, including fragments thereof. Polynucleotides of the present invention also include, but
are not limited to, a polynucleotide complementary to the nucleotide sequence of SEQ ID NO: 1.

Polynucleotides according to the invention have numerous applications in a
variety of techniques known to those skilled in the art of molecular biology. These techniques
include use as hybridization probes, use as oligomers, i.e. primers for PCR, use for chromosome
and gene mapping, use in the recombinant production of protein, and use in generation of
antisense DNA or RNA, their chemical analogs and the like. For example, when the expression
of an mRNA is largely restricted to a particular cell or tissue type, polynucleotides of the
invention can be used as hybridization probes to detect the presence of the specific mRNA in the
particular cell or tissue RNA using, e.g., in situ hybridization. The invention also includes
vectors encoding the polynucleotides of the invention.

The invention also describes the deduced amino acid sequence of the CARM1
protein (SEQ ID NO: 2). The invention also describes isolated CARM1 proteins.

The polypeptides according to the invention can be used in a variety of procedures
and methods that are currently applied to other proteins. For example, a polypeptide of the
invention can be used to generate an antibody that specifically binds the polypeptide. The
invention describes antibodies that specifically interact with the CARM1 protein or fragments
thereof.

The polypeptides of the invention also act as methyltransferases of histones and
other proteins and can therefore be used for the study of methylation processes in transcription
and to methylate amino acid residues within histones and other proteins.

Methylated proteins produced by the methods of the invention can be used to 
identify demethylating enzymes. Methylated histones, for example, can be used to screen for 
demethylating enzymes. 

The methods of the present invention further relate to the methods for detecting
the presence of the polynucleotides or polypeptides of the invention in a sample. Such methods
can, for example, be utilized as a prognostic indicator of diseases that involve CARM1, modified
forms of CARM1, or altered expression of CARM1.

Methods are also provided for identifying proteins that interact with CARM1 as
well as methods for screening of drugs that alter CARM1's interactions with other proteins.

Another aspect of the invention is to provide methods to screen for molecules that
alter CARM1 methyltransferase activity.

BRIEF DESCRIPTION OF THE FIGURES 

Figure 1 shows a comparison of the region of highest homology between
CARM1, three other mammalian protein arginine methyltransferases (Lin, W-J. et al. 1996;
Tang, J. et al. (1998) J. Biol. Chem. 273:16935; Scott, H.S. et al. (1998) Genomics 48:330) and
one yeast protein arginine methyltransferase (Gary, J.D. et al. 1996); the sequences are shown
with dashes (-) representing the same amino acid as in CARM1 and dots (.) representing spaces
inserted for optimum alignment. The location of a VLD-to-AAA mutation used in these studies
is indicated.

Figure 2 shows the expression of CARM1 mRNA in various adult mouse tissues
as examined by hybridizing a 0.6-kb BamHI cDNA fragment (representing CARM1 codons
3-198) to a multiple tissue northern blot (Clontech) as described previously (Hong, H. et al.
1997). Positions of RNA size markers are shown on the left.

Figure 3 shows the binding in vitro of CARM1 (SEQ ID NO: 2) and a CARM1
variant (SEQ ID NO: 3) to the C-terminal region of p160 coactivators.

Figure 4 shows binding of CARM1 to GRIP1 in vivo, i.e. in living yeast.
Sub-fragments of the GRIP1 C-terminal domain (GRIP1c), fused with the Gal4DBD, were tested
in the yeast two-hybrid system as described previously (Ding, X.F. et al. (1998) Mol. Endocrinol.
12:302) for binding to CARM1 or to α-actinin, fused to Gal4AD. β-galactosidase (β-gal)
activity indicates interaction between the two hybrid proteins.

Figure 5 shows the enhancement by CARM1 of reporter gene activation by
Gal4DBD-GRIP1c. A) CV-1 cells in 6-well dishes (3.3 cm diameter well) were transiently
transfected with 0.5 µg of pM.GRIP1c (coding for Gal4DBD-GRIP1c where GRIP1c is GRIP1
amino acids 1121-1462), 0.5 µg of GK1 reporter gene (luciferase gene controlled by Gal4
binding sites) (Webb, P. et al. (1998) Mol. Endocrinol. 12:1605), and 0-0.8 µg of
pSG5.HA-CARM1, using Superfectin (Qiagen) according to manufacturer's protocol. Total
DNA was adjusted to 2.0 µg per well with the appropriate amount of pSG5. Cell extracts were
prepared approximately 48 h after transfection and assayed with Promega Luciferase Assay kit.
Relative light units of luciferase activity presented are the mean and standard deviation of three
transfected wells. B) CV-1 cells were transfected as in A with the indicated amount of
pM.GRIP1c and zero or 0.5 µg of pSG5.HA-CARM1.
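
By way of illustration only, the adjustment of total DNA described for Figure 5 can be expressed as simple arithmetic: the amount of empty pSG5 vector added to a well is the 2.0 µg target minus the sum of the other plasmids in that well. The following minimal sketch assumes the plasmid amounts recited above; the function name filler_dna_ug and the dictionary layout are hypothetical and form no part of the described experiments.

```python
def filler_dna_ug(plasmid_amounts_ug, total_ug=2.0):
    """Return the amount (in micrograms) of empty vector needed to reach total_ug.

    plasmid_amounts_ug maps a plasmid name to the amount used in one well.
    The names and the helper itself are illustrative assumptions.
    """
    used = sum(plasmid_amounts_ug.values())
    filler = total_ug - used
    if filler < -1e-9:
        raise ValueError("plasmid amounts exceed the target total DNA")
    # Round away floating-point noise; never report a negative amount.
    return max(round(filler, 6), 0.0)


# Well with the highest CARM1 plasmid amount recited for Figure 5A.
well = {"pM.GRIP1c": 0.5, "GK1": 0.5, "pSG5.HA-CARM1": 0.8}
print(filler_dna_ug(well))
```

With the amounts shown, 1.8 µg of DNA is already present, so 0.2 µg of pSG5 brings the well to 2.0 µg; a well with no pSG5.HA-CARM1 would take 1.0 µg of pSG5.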

Figure 6 shows the enhancement by CARM1 of reporter gene activation by
nuclear receptors and the elimination of CARM1 coactivator function by the VLD-to-AAA
mutation. (A) Transient transfection assays with CV-1 cells were performed as in Figure 5 with
0.5 ng of GK1 reporter gene and 0.5 µg of each of the indicated vectors. (B) CV-1 cells were
transiently transfected as in Figure 5 with the following vectors, as indicated: 0.5 µg of nuclear
receptor expression vector pSVARo (Brinkmann, A.O. et al. (1989) Steroid Biochem. Molec.
Biol. 34:307) expressing AR, pHEO (Green, S. et al. (1988) Nucleic Acids Res. 16:369)
expressing ER, or pCMX.hTRβ1 (Feng, W. et al. (1998) Science 280:1747) expressing TR; 0.5
µg of a luciferase reporter gene with an appropriate promoter, MMTV promoter for AR, or
MMTV promoter with the native glucocorticoid response elements replaced by a single estrogen
response element for ER or palindromic thyroid hormone response element for TR (Umesono, K.
and R.M. Evans (1989) Cell 57:1139); 0.5 µg of pSG5.HA-GRIP1; and 0.5 µg of
pSG5.HA-CARM1 or pSG5.HA-CARM1(VLD mutant). Transfection efficiency was monitored
by using β-galactosidase activity expressed from 0.1 µg of co-transfected pCMV-βgal vector
(Hong, H. et al. (1996) Proc. Natl. Acad. Sci. USA 93:4948-4952) as an internal control. After
transfection, cells were grown in charcoal-treated serum; where indicated 20 nM hormone (H),
i.e. dihydrotestosterone for AR, estradiol for ER, or triiodothyronine for TR, was included
during the last 40 h of culture. The data is representative of three independent experiments.

Figure 7 shows a model for primary and secondary coactivators of nuclear
receptors (NR). Nuclear receptor dimers bind directly to the hormone response element (HRE)
and activate transcription by recruiting coactivators, which open chromatin structure (signified by
nucleosome) and recruit a transcription initiation complex (TIC), composed of RNA polymerase
II (Pol II), basal transcription factors such as TBP and TFIIB, and a large complex of accessory
proteins (Chang, M. and J.A. Jaehning (1997) Nucleic Acids Res. 25:4861). GRIP1 and other
p160 family members serve as primary coactivators in this case, binding directly to the NRs.
CBP, p/CAF, and CARM1 are recruited by the primary coactivators and thus serve as secondary
coactivators. Some coactivators (e.g. CBP) may help to recruit the TIC through direct
interactions with basal transcription factors. Some coactivators (e.g. CBP and p/CAF) can
acetylate histones, using acetyl-CoA (AcCoA). We propose that CARM1's coactivator activity is
due to its ability to methylate histones or other proteins in chromatin or the transcription
initiation complex, using S-adenosylmethionine (SAM) as methyl donor.

Figure 8 shows the histone methyltransferase activity of CARM1. CARM1 and
PRMT1 have different protein methyltransferase substrate specificities. CARM1 methylates
histone H3, whereas PRMT1 methylates histone H4. PRMT1 was also previously shown to
methylate other proteins, including hnRNP A1, but had not previously been shown to methylate
histone H4. (A,B) Calf thymus histones (Boehringer Mannheim) were incubated for 30 min at
30 °C in 32.5 µl reactions containing 20 mM Tris-Cl, 0.2 M NaCl, 4 mM EDTA, pH 8.0, 0.32
mg/ml individual histone (2a, 2b, 3, or 4) or 1.3 mg/ml mixed histone (M), 0.037 mg/ml
GST-CARM1 or GST-PRMT1, and 7 µM S-adenosyl-L-[methyl-3H]methionine (specific activity
of 14.7 Ci/mmol). Reactions were stopped by addition of SDS-NuPAGE sample buffer (Novex),
and 40% of each stopped reaction was then subjected to SDS-PAGE in 4-12% NuPAGE Bis-Tris
gradient gels (Novex) using the Na-MES running buffer. Gels were stained with Coomassie
Blue R-250 (A), and then subjected to fluorography (Chamberlin, M. (1978) Anal. Biochem.
98:132) for 12 h at -70 °C on sensitized Kodak XAR-5 film (B). Molecular weight markers
(MW) are shown at left. Concentrations of GST fusion proteins were determined in comparison
with bovine serum albumin standards (Sigma) by SDS polyacrylamide gel electrophoresis and
Coomassie Blue staining; it was assumed that bovine serum albumin stained twice as intensely as
most other proteins. Concentrations of histones and hnRNP A1 were determined by the method
of Lowry (Lowry, O.H. et al. (1951) J. Biol. Chem. 193:265). (C) Methylation and
electrophoresis were carried out as described above except that protein substrates were 2.7 mg/ml
mixed histone (His), 0.083 mg/ml hnRNP A1 (A1), or no substrate (-), and the concentrations of
GST-CARM1, GST-CARM1 VLD mutant (VLD), and GST-PRMT1 were 0.05, 0.02, and 0.03
mg/ml respectively. Two different preparations of the GST-CARM1 VLD mutant failed to show
detectable activity towards any substrate. Recombinant human hnRNP A1 expressed in E. coli
(Mayeda, A. and A.R. Krainer (1995) Cell 68:365) was kindly provided by Dr. A. Krainer (Cold
Spring Harbor Laboratories, NY).

Figure 9 shows that PRMT1 can also serve as a coactivator for nuclear receptors.
Furthermore, CARM1 and PRMT1 act cooperatively as enhancers of nuclear receptor function,
i.e. the two together are at least as effective or more effective than the sum of their individual
activities. Transient transfections were performed as in Figure 5. CV-1 cells were transfected
with the following plasmids: expression vector for nuclear receptor (0.1 µg of pSVARo for
androgen receptor [AR], 0.001 µg of pCMX.TRβ1 for thyroid receptor [TR], or 0.001 µg of
pHEO for estrogen receptor [ER]), 0.25 µg of reporter gene for each nuclear receptor as described
in Figure 6B, 0.25 µg of pSG5.HA-GRIP1, and the indicated amount of plasmids encoding
CARM1 or PRMT1.

Figure 10 shows that at low levels of nuclear receptor expression, the hormone
dependent activity of the nuclear receptors depends almost entirely on the presence of three
different coactivators, at least one of which is a protein methyltransferase. Several different
combinations of three coactivators work: A) Orphan nuclear receptors ERR3 and ERR1, which
require no ligand, are active without exogenously added coactivators when high levels of these
nuclear receptors are expressed; but when low levels of these nuclear receptors are expressed
(1 ng of expression plasmid in Figure 10A), GRIP1 + CARM1 + PRMT1 is required for activity
(226-fold over controls for ERR3 and 15.8-fold over controls for ERR1). Omission of any one
of these coactivators almost completely eliminated activity. p160 coactivators other than GRIP1
could be substituted for GRIP1 with similar results. B) When high levels of estrogen receptor are
expressed in CV-1 cells (using 100 ng of ER expression vector), the estrogen receptor alone is
active, and the activity is enhanced by GRIP1 alone or GRIP1 + one other coactivator (p300 or
CARM1) (right side of panel). However, when low levels of estrogen receptor are expressed (using 1-10
ng of ER expression vector) ER alone is almost inactive, and individual coactivators or
combinations of any two coactivators cause little stimulation; activity is almost entirely
dependent on the presence of GRIP1 + CARM1 + p300 (left side of panel).

Figure 11 shows that PRMT2 and PRMT3 also serve as coactivators for nuclear
receptors. CV-1 cells were transiently transfected with expression vectors for the orphan (i.e. no
ligand) nuclear receptor ERR3, and where indicated the expression vectors for GRIP1, CARM1,
PRMT1, PRMT2, and PRMT3. Like PRMT1 (Figs. 9 & 10), PRMT2 and PRMT3 could
enhance nuclear receptor function in cooperation with CARM1.

DETAILED DESCRIPTION



DEFINITIONS 

The term "nucleotide sequence" refers to a heteropolymer of nucleotides or the
sequence of nucleotides. One of skill in the art will readily discern from contextual cues which
of the two definitions is appropriate. The terms "nucleic acid," "nucleic acid molecule" and
"polynucleotide" are also used interchangeably herein to refer to a heteropolymer of nucleotides.
Generally, nucleic acid segments provided by this invention may be assembled from fragments of
the genome and short oligonucleotide linkers, or from a series of oligonucleotides, or from
individual nucleotides, to provide a synthetic nucleic acid which is capable of being expressed in
a recombinant transcriptional unit comprising regulatory elements derived from a microbial or
viral operon, or a eukaryotic gene.

The terms "oligonucleotide fragment" or a "polynucleotide fragment," "portion,"
or "segment" refer to a stretch of nucleotide residues which is long enough to use in polymerase
chain reaction (PCR) or various hybridization procedures to identify or amplify identical or
related parts of mRNA or DNA molecules.

"Oligonucleotides" or "nucleic acid probes" are prepared based on the 
polynucleotide sequences provided herein. Oligonucleotides comprise portions of such a
polynucleotide sequence having at least about 15 nucleotides and usually at least about 20
nucleotides. Nucleic acid probes comprise portions of such a polynucleotide sequence having
fewer nucleotides than about 3 kb, usually fewer than 1 kb. After appropriate testing to eliminate
false positives, these probes may, for example, be used to determine whether specific mRNA 
molecules are present in a cell or tissue. 

The term "probes" includes naturally occurring or recombinant or chemically
synthesized single- or double-stranded nucleic acids. They may be labeled by nick translation,
Klenow fill-in reaction, PCR or other methods well known in the art. Probes of the present
invention, their preparation and/or labeling are elaborated in Sambrook, J. et al., 1989, Molecular
Cloning: A Laboratory Manual, Cold Spring Harbor, New York; or Ausubel, F. et al., 1989,
Current Protocols in Molecular Biology, John Wiley & Sons, New York, both of which are
incorporated herein by reference in their entirety.

The term "recombinant," when used herein to refer to a polypeptide or protein, 
means that a polypeptide or protein is derived from recombinant (e.g., microbial, mammalian, or 
insect-based) expression systems. "Microbial" refers to recombinant polypeptides or proteins 
made in bacterial or fungal (e.g., yeast) expression systems. As a product, "recombinant
microbial" defines a polypeptide or protein essentially free of native endogenous substances and 
unaccompanied by associated native glycosylation. Polypeptides or proteins expressed in most 
bacterial cultures, e.g., E. coli, will be free of glycosylation modifications; polypeptides or 
proteins expressed in yeast may have a glycosylation pattern in general different from those 
expressed in mammalian cells. 

The term "recombinant expression vector" refers to a plasmid or phage or virus or 
vector, for expressing a polypeptide from a polynucleotide sequence. An expression vector can 
comprise a transcriptional unit comprising an assembly of: 1) a genetic element or elements 
having a regulatory role in gene expression, for example, promoters or enhancers, 2) a structural 
or coding sequence which is transcribed into mRNA and translated into protein, and 3) 
appropriate transcription initiation and termination sequences. It may include an N-terminal
methionine residue. This residue may or may not be subsequently cleaved from the expressed 
recombinant protein to provide a final product. 

The term "recombinant expression system" means host cells which have stably 
integrated a recombinant transcriptional unit into chromosomal DNA or carry the recombinant 
transcriptional unit extrachromosomally. Recombinant expression systems as defined herein will 
express heterologous polypeptides or proteins upon induction of the regulatory elements linked to 
the DNA segment or synthetic gene to be expressed. This term also encompasses host cells 
which have stably integrated a recombinant genetic element or elements having a regulatory role 
in gene expression, for example, promoters or enhancers. Recombinant expression systems as 
defined herein will express polypeptides or proteins endogenous to the cell upon induction of the
regulatory elements linked to the endogenous DNA segment or gene to be expressed. The cells
can be prokaryotic or eukaryotic.

The term "open reading frame," or "ORF," means a series of nucleotide triplets
coding for amino acids without any termination codons and is a sequence translatable into
protein.
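
By way of illustration only, the definition of an open reading frame given above can be checked mechanically: the sequence is read as consecutive triplets, and none of the triplets may be a termination codon. The following minimal sketch assumes the standard genetic code (termination codons TAA, TAG, and TGA) and a sequence already placed in the intended reading frame; the function name is hypothetical.

```python
# Termination codons of the standard genetic code (an assumption of this sketch).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def is_open_reading_frame(sequence):
    """Return True if the sequence is a series of triplets with no termination codon.

    The sequence is read from its first nucleotide, so the caller is assumed
    to have placed it in the intended reading frame. RNA input (U) is accepted.
    """
    seq = sequence.upper().replace("U", "T")
    if not seq or len(seq) % 3 != 0:
        return False  # not a whole number of triplets
    if set(seq) - set("ACGT"):
        return False  # ambiguous or non-nucleotide characters
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    return all(codon not in STOP_CODONS for codon in codons)


print(is_open_reading_frame("ATGGCCAAA"))  # three sense codons
print(is_open_reading_frame("ATGTAAGCC"))  # contains TAA in frame
```

The sketch deliberately does not require an initiating ATG, because the definition above recites only the absence of termination codons.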

The term "active" refers to those forms of the polypeptide which retain a biologic
and/or immunologic activity or activities of any naturally occurring polypeptide. An active
polypeptide can possess one activity of a polypeptide, but not another, e.g., possess p160 binding
activity but lack methyltransferase activity.

The term "naturally occurring polypeptide" refers to polypeptides produced by 
cells that have not been genetically engineered and specifically contemplates various 
polypeptides arising from post-translational modifications of the polypeptide including, but not
limited to, acetylation, carboxylation, glycosylation, phosphorylation, lipidation and acylation. 

The term "derivative" refers to polypeptides chemically modified by such 
techniques as ubiquitination, labeling (e.g., with radionuclides or various enzymes), pegylation 
(derivatization with polyethylene glycol) and insertion or substitution by chemical synthesis of 
amino acids such as ornithine, which do not normally occur in human proteins.

The term "recombinant variant" refers to any polypeptide differing from naturally
occurring polypeptides by amino acid insertions, deletions, and substitutions, created using
recombinant DNA techniques. Guidance in determining which amino acid residues may be
replaced, added or deleted without abolishing activities of interest, such as catalytic activity, may
be found by comparing the sequence of the particular polypeptide with that of homologous
peptides and minimizing the number of amino acid sequence changes made in regions of high
homology.

Preferably, amino acid "substitutions" are the result of replacing one amino acid
with another amino acid having similar structural and/or chemical properties, i.e., conservative
amino acid replacements. Amino acid substitutions may be made on the basis of similarity in
polarity, charge, solubility, hydrophobicity, hydrophilicity, and/or the amphipathic nature of the
residues involved. For example, nonpolar (hydrophobic) amino acids include alanine, leucine,
isoleucine, valine, proline, phenylalanine, tryptophan, and methionine; polar neutral amino acids
include glycine, serine, threonine, cysteine, tyrosine, asparagine, and glutamine; positively
charged (basic) amino acids include arginine, lysine, and histidine; and negatively charged
(acidic) amino acids include aspartic acid and glutamic acid. "Insertions" or "deletions" are
typically in the range of about 1 to 5 amino acids. The variation allowed may be experimentally
determined by systematically making insertions, deletions, or substitutions of amino acids in a
polypeptide molecule using recombinant DNA techniques and assaying the resulting recombinant
variants for activity.
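
By way of illustration only, the four groupings of amino acids recited above can be used to label a single substitution as conservative when both residues fall in the same group. The following minimal sketch uses one-letter amino acid codes; the dictionary and function names are hypothetical, and grouping by these four classes is only one of the similarity criteria mentioned above.

```python
# One-letter codes for the four groups recited in the text above.
AMINO_ACID_CLASSES = {
    "nonpolar": set("ALIVPFWM"),      # Ala, Leu, Ile, Val, Pro, Phe, Trp, Met
    "polar_neutral": set("GSTCYNQ"),  # Gly, Ser, Thr, Cys, Tyr, Asn, Gln
    "basic": set("RKH"),              # Arg, Lys, His
    "acidic": set("DE"),              # Asp, Glu
}


def residue_class(residue):
    """Return the name of the group containing the given one-letter residue."""
    code = residue.upper()
    for name, members in AMINO_ACID_CLASSES.items():
        if code in members:
            return name
    raise ValueError(f"unrecognized amino acid code: {residue!r}")


def is_conservative(original, replacement):
    """True when both residues belong to the same group listed above."""
    return residue_class(original) == residue_class(replacement)


print(is_conservative("L", "I"))  # both nonpolar
print(is_conservative("D", "A"))  # acidic replaced by nonpolar
```

Under this grouping, replacing leucine with isoleucine is conservative, whereas replacing aspartic acid with alanine is not.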

Alternatively, where alteration of function is desired, insertions, deletions or
non-conservative alterations can be engineered to produce polypeptide variants. Such variants can,
for example, alter one or more of the biological functions or biochemical characteristics of the
polypeptides of the invention. For example, such alterations may change polypeptide
characteristics such as ligand-binding affinities, interchain affinities, or degradation/turnover rate.
Further, such alterations can be selected so as to generate polypeptides that are better suited for
expression, scale up and the like in the host cells chosen for expression. For example, cysteine
residues can be deleted or substituted with another amino acid residue in order to eliminate
disulfide bridges. A variant's catalytic efficiency can be diminished through deletion or
non-conservative substitution of residues important for catalysis.

As used herein, "substantially equivalent" can refer both to nucleotide and amino
acid sequences, for example a mutant sequence, that varies from a reference sequence by one or
more substitutions, deletions, or additions, the net effect of which does not result in an adverse
functional dissimilarity between the reference and subject sequences. Typically, such a
substantially equivalent sequence varies from one of those listed herein by no more than about
20% (i.e., the number of individual residue substitutions, additions, and/or deletions in a
substantially equivalent sequence, as compared to the corresponding reference sequence, divided
by the total number of residues in the substantially equivalent sequence is about 0.20 or less).
Such a sequence is said to have 80% sequence identity to the listed sequence. In one
embodiment, a substantially equivalent, e.g., mutant, sequence of the invention varies from a
listed sequence by no more than 20% (80% sequence identity); in a variation of this embodiment,
by no more than 10% (90% sequence identity); and in a further variation of this embodiment, by
no more than 5% (95% sequence identity). Substantially equivalent, e.g., mutant, amino acid
sequences according to the invention generally have at least 80% sequence identity with a listed
amino acid sequence.
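
The ratio defined in this paragraph can be written out as a short calculation. The sketch below is in Python and assumes the counts of substitutions, additions, and deletions have already been obtained from an alignment, which is not shown. The function names are hypothetical. The text says "about 0.20", so treating 0.20 as a hard cutoff is a simplifying assumption.

```python
def variation_fraction(substitutions: int, additions: int, deletions: int,
                       length: int) -> float:
    """Changes divided by the number of residues in the equivalent sequence."""
    if length <= 0:
        raise ValueError("length must be a positive number of residues")
    return (substitutions + additions + deletions) / length


def percent_identity(substitutions: int, additions: int, deletions: int,
                     length: int) -> float:
    """Sequence identity implied by the variation fraction, in percent."""
    fraction = variation_fraction(substitutions, additions, deletions, length)
    return 100.0 * (1.0 - fraction)


def is_substantially_equivalent(substitutions: int, additions: int,
                                deletions: int, length: int,
                                max_fraction: float = 0.20) -> bool:
    """Apply the threshold; 0.20, 0.10 and 0.05 match the stated embodiments."""
    fraction = variation_fraction(substitutions, additions, deletions, length)
    return fraction <= max_fraction
```

For example, a 100-residue sequence with 10 substitutions, 3 additions, and 2 deletions has a variation fraction of 0.15. It satisfies the 20% embodiment but not the 10% embodiment.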

A polypeptide "fragment," "portion," or "segment" is a stretch of amino acid
residues of at least about 5 amino acids, often at least about 7 amino acids, typically at least about
9 to 13 amino acids, and, in various embodiments, at least about 17 or more amino acids. To be
active, any polypeptide must have sufficient length to display biologic and/or immunologic
activity.

Alternatively, recombinant variants encoding these same or similar polypeptides
may be synthesized or selected by making use of the "redundancy" in the genetic code. Various
codon substitutions, such as the silent changes which produce various restriction sites, are well
known in the art and may be introduced to optimize cloning into a plasmid or viral vector or
expression in a particular prokaryotic or eukaryotic system. Mutations in the polynucleotide
sequence may be reflected in the polypeptide or domains of other peptides added to the
polypeptide to modify the properties of any part of the polypeptide, to change characteristics such
as ligand-binding affinities, interchain affinities, or degradation/turnover rate.
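
A silent change that produces a restriction site can be illustrated with the standard genetic code. The following Python sketch builds the standard codon table and checks that a single-base change leaves the encoded amino acids unchanged while creating a BamHI recognition site (GGATCC). The six-base stretch is a made-up example, not taken from SEQ ID NO: 1, and the helper name `translate` is hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G
# (third position varies fastest); '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA coding sequence into one-letter amino acids."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


BAMHI_SITE = "GGATCC"

# Hypothetical two-codon stretch: GGT (Gly) followed by TCC (Ser).
original = "GGTTCC"
# Changing GGT to GGA still encodes Gly but creates the BamHI site.
variant = "GGATCC"
```

Here both `translate(original)` and `translate(variant)` give the same two residues, and only the variant contains the BamHI site.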

The term "purified" as used herein denotes that the indicated nucleic acid or
polypeptide is present in the substantial absence of other biological macromolecules, e.g.,
polynucleotides, proteins, and the like. In one embodiment, the polynucleotide or polypeptide is
purified such that it constitutes at least 95% by weight, more preferably at least 99.8% by weight,
of the indicated biological macromolecules present (but water, buffers, and other small
molecules, especially molecules having a molecular weight of less than 1000 daltons, can be
present).

The term "isolated" as used herein refers to a nucleic acid or polypeptide
separated from at least one other component (e.g., nucleic acid or polypeptide) present with the
nucleic acid or polypeptide in its natural source. In one embodiment, the nucleic acid or
polypeptide is found in the presence of (if anything) only a solvent, buffer, ion, or other
component normally present in a solution of the same. The terms "isolated" and "purified" do
not encompass nucleic acids or polypeptides present in their natural source.

The term "infection" refers to the introduction of nucleic acids into a suitable host
cell by use of a virus or viral vector. The term "transformation" means introducing DNA into a
suitable host cell so that the DNA is replicable, either as an extrachromosomal element, or by
chromosomal integration. The term "transfection" refers to the taking up of an expression vector
by a suitable host cell, whether or not any coding sequences are in fact expressed.

Each of the above terms is meant to encompass all that is described for each,
unless the context dictates otherwise.

POLYNUCLEOTIDES AND NUCLEIC ACIDS OF THE INVENTION

The invention provides polynucleotides substantially equivalent to SEQ ID NO: 1,
which is the cDNA encoding the polypeptide sequence, SEQ ID NO: 2. The present invention
also provides genes corresponding to the cDNA sequences disclosed herein. The corresponding
genes can be isolated in accordance with known methods using the sequence information
disclosed herein. Such methods include the preparation of probes or primers from the disclosed
sequence information for identification and/or amplification of genes in appropriate genomic
libraries or other sources of genomic materials.

The compositions of the present invention include isolated polynucleotides,
including recombinant DNA molecules, cloned genes or degenerate variants thereof, especially
naturally occurring variants such as allelic variants, novel isolated polypeptides, and antibodies
that specifically recognize one or more epitopes present on such polypeptides.

The polynucleotides of the invention also include nucleotide sequences that are
substantially equivalent to the polynucleotides recited above. Polynucleotides according to the
invention can have at least about 80%, more typically at least about 90%, and even more
typically at least about 95%, sequence identity to a polynucleotide recited above. The invention
also provides the complement of the polynucleotides including a nucleotide sequence that has at
least about 80%, more typically at least about 90%, and even more typically at least about 95%,
sequence identity to a polynucleotide encoding a polypeptide recited above. The polynucleotide
can be DNA (genomic, cDNA, amplified, or synthetic) or RNA such as mRNA or an antisense
RNA. Methods and algorithms for obtaining such polynucleotides are well known to those of
skill in the art and can include, for example, methods for determining hybridization conditions
which can routinely isolate polynucleotides of the desired sequence identities.

A polynucleotide according to the invention can be joined to any of a variety of
other nucleotide sequences by well-established recombinant DNA techniques (see Sambrook J et
al (1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory, NY).
Useful nucleotide sequences for joining to polynucleotides include an assortment of vectors, e.g.,
plasmids, cosmids, lambda phage derivatives, phagemids, and the like, that are well known in the
art. Accordingly, the invention also provides a vector including a polynucleotide of the invention
and a host cell containing the polynucleotide. In general, the vector contains an origin of
replication functional in at least one organism, convenient restriction endonuclease sites, and a
selectable marker for the host cell. Vectors according to the invention include expression
vectors, replication vectors, probe generation vectors, and sequencing vectors. A host cell
according to the invention can be a prokaryotic or eukaryotic cell and can be a unicellular
organism or part of a multicellular organism.

The sequences falling within the scope of the present invention are not limited to
the specific sequences herein described, but also include allelic variations thereof. Allelic
variations can be routinely determined by comparing the sequence provided in SEQ ID NO: 1, a
representative fragment thereof, or a nucleotide sequence at least 98% identical to SEQ ID
NO: 1, with a sequence from another murine isolate. An allelic variation is more typically at
least 99% identical to SEQ ID NO: 1 and even more typically 99.8% identical to SEQ ID NO: 1.
Furthermore, to accommodate codon variability, the invention includes nucleic acid molecules
coding for the same amino acid sequences as do the specific ORFs disclosed herein. In other
words, in the coding region of an ORF, substitution of one codon for another which encodes the
same amino acid is expressly contemplated. Any specific sequence disclosed herein can be
readily screened for errors by resequencing a particular fragment, such as an ORF, in both
directions (i.e., sequence both strands).

The present invention further provides recombinant constructs comprising a
nucleic acid having the sequence of SEQ ID NO: 1 or a fragment thereof. The recombinant
constructs of the present invention comprise a vector, such as a plasmid or viral vector, into
which a nucleic acid having the sequence of SEQ ID NO: 1 or a fragment thereof is inserted, in a
forward or reverse orientation. In the case of a vector comprising one of the ORFs of the present
invention, the vector may further comprise regulatory sequences including, for example, a
promoter operably linked to the ORF. Large numbers of suitable vectors and promoters are
known to those of skill in the art and are commercially available for generating the recombinant
constructs of the present invention.

The nucleic acid sequences of the invention are further directed to sequences
which encode variants of the described nucleic acids. These amino acid sequence variants may
be prepared by methods known in the art by introducing appropriate nucleotide changes into a
native or variant polynucleotide. There are two variables in the construction of amino acid
sequence variants: the location of the mutation and the nature of the mutation. The amino acid
sequence variants of the nucleic acids are preferably constructed by mutating the polynucleotide
to give an amino acid sequence that does not occur in nature. In a preferred method,
polynucleotides encoding the novel nucleic acids are changed via site-directed mutagenesis.

USE OF NUCLEIC ACIDS AS PROBES 

Another aspect of the subject invention is to provide for polypeptide-specific
nucleic acid hybridization probes capable of hybridizing with naturally-occurring nucleotide
sequences. The hybridization probes of the subject invention may be derived from the nucleotide
sequence of SEQ ID NO: 1, fragments or complements thereof. Because the corresponding gene
is only expressed in a limited number of tissues, a hybridization probe derived from SEQ ID NO:
1 can be used as an indicator of the presence of RNA of cell type of such a tissue in a sample as
shown in Example 1.

Such probes may be of recombinant origin, may be chemically synthesized, or a
mixture of both. The probe will comprise a discrete nucleotide sequence for the detection of
identical sequences or a degenerate pool of possible sequences for identification of closely
related genomic sequences. Other means for producing specific hybridization probes for nucleic
acids include the cloning of nucleic acid sequences into vectors for the production of mRNA
probes.
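
Because probes may be derived from complements of a sequence, it can help to see how a complementary strand is computed. The sketch below is a minimal Python illustration. The ten-base fragment is hypothetical and is not taken from SEQ ID NO: 1, and `reverse_complement` is a name introduced here for illustration.

```python
# Translation table mapping each DNA base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return dna.translate(_COMPLEMENT)[::-1]


# Hypothetical target fragment, written 5' to 3'.
fragment = "ATGGCCAAGT"
# A probe with this sequence would base-pair with the fragment above.
probe = reverse_complement(fragment)
```

Applying `reverse_complement` twice returns the original fragment, which is a quick consistency check when designing probes by hand.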

HOSTS 

The present invention further provides host cells genetically engineered to contain
the polynucleotides of the invention. For example, such host cells may contain nucleic acids of
the invention introduced into the host cell using known transformation, transfection or infection
methods. The present invention still further provides host cells genetically engineered to express
the polynucleotides of the invention, wherein such polynucleotides are in operative association
with a regulatory sequence heterologous to the host cell which drives expression of the
polynucleotides in the cell.

The host cell can be a higher eukaryotic host cell, such as a mammalian cell or an
insect cell, a lower eukaryotic host cell, such as a yeast cell, or the host cell can be a prokaryotic
cell, such as a bacterial cell. Introduction of the recombinant construct into the host cell can be
effected by calcium phosphate transfection, DEAE-dextran mediated transfection, or
electroporation (Davis, L. et al, Basic Methods in Molecular Biology (1986)). The host cells
containing one of the polynucleotides of the invention can be used in conventional manners to
produce the gene product encoded by the isolated fragment (in the case of an ORF) or can be
used to produce a heterologous protein under the control of an appropriate promoter region.

Any host/vector system can be used to express one or more of the ORFs of the
present invention. These include, but are not limited to, eukaryotic hosts such as HeLa cells,
CV-1 cells, COS cells, and Sf9 cells, as well as prokaryotic hosts such as E. coli and B. subtilis.
The most preferred cells are those which do not normally express the particular polypeptide or
protein or which express the polypeptide or protein at a low natural level. Mature proteins can
be expressed in mammalian cells, yeast, bacteria, or other cells under the control of appropriate
promoters. Cell-free translation systems can also be employed to produce such proteins using
RNAs derived from the DNA constructs of the present invention. Appropriate cloning and
expression vectors for use with prokaryotic and eukaryotic hosts are described by Sambrook, et
al, in Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor, New
York (1989).

REGULATION OF TRANSCRIPTION 

Polynucleotides of the invention and vectors capable of expressing these 
polynucleotides are useful for the regulation of transcription in cells. 

Increased expression of CARM1 in cells enhances the function of nuclear receptor
coactivators of the p160 family including GRIP1, SRC-1, and p/CIP. CARM1 expression in
mammalian cells enhances the activity of full length GRIP1 or of the C-terminal domain of
GRIP1 attached to the DNA binding domain of a heterologous protein. Increased expression of
CARM1 in cells, in conjunction with increased expression of coactivators of the GRIP1 family,
enhances the function of nuclear receptors. The enhancement by CARM1 is over and above that
achieved by the increased expression of a GRIP1-type coactivator. Thus, CARM1 can serve as a
coactivator for nuclear receptors.

The activity of other transcriptional activator proteins that rely on GRIP1-type
coactivators will be enhanced by increased expression of CARM1. Examples of other
transcriptional activator proteins that may use GRIP1-type coactivators are other nuclear
receptors, AP1, and STATs (Glass CK et al (1997) Curr. Opin. Cell Biol. 9:222-232; Kamei Y
et al (1996) Cell 85:403-414; Korzus E et al (1998) Science 279:703-707; Yao T-P et al (1996)
Proc. Natl. Acad. Sci. USA 93:10626-10631).

CARM1 polynucleotides or polypeptides can also be used in conjunction with
other transcriptional activating molecules to increase transcription of a nuclear
receptor-dependent gene. In one embodiment, CARM1 is expressed simultaneously with a histone acetyl
transferase (HAT). Transcription of a gene under the control of a nuclear receptor is
synergistically enhanced by the presence of CARM1 and a HAT.

GENE THERAPY

Polynucleotides of the present invention can also be used for gene therapy for the
treatment of disorders which are mediated by CARM1, certain hormones, such as those that act
as ligands for nuclear hormone receptors, or by nuclear hormone receptors. Such therapy
achieves its therapeutic effect by introduction of the appropriate CARM1 polynucleotide (e.g.,
SEQ ID NO: 1) which contains a CARM1 gene (sense or antisense), into cells of subjects having
the disorder to increase or decrease CARM1 activity in the subjects' cells. Delivery of sense or
antisense CARM1 polynucleotide constructs can be achieved using a recombinant expression
vector such as a chimeric virus or a colloidal dispersion system. An expression vector including
the CARM1 polynucleotide sequence may be introduced to the subject's cells ex vivo after
removing, for example, stem cells from a subject's bone marrow. The cells are then reintroduced
into the subject (e.g., into the subject's bone marrow).

Various viral vectors which can be utilized for gene therapy as taught herein
include adenovirus, herpes virus, vaccinia, or, preferably, an RNA virus such as a retrovirus.
Preferably, the retroviral vector is a derivative of a murine or avian retrovirus. Examples of
retroviral vectors in which a single foreign gene can be inserted include, but are not limited to:
Moloney murine leukemia virus (MoMuLV), Harvey murine sarcoma virus (HaMuSV), murine
mammary tumor virus (MuMTV), Rous Sarcoma Virus (RSV), and gibbon ape leukemia virus
(GaLV), which provides a broader host range than many of the murine viruses. A number of
additional retroviral vectors can incorporate multiple genes. All of these vectors can transfer or
incorporate a gene for a selectable marker so that transduced cells can be identified and selected
for. By inserting a CARM1 sequence of interest into the viral vector, along with another gene
which encodes the ligand for a receptor on a specific target cell, for example, the vector is now
target specific. Preferred targeting is accomplished by using an antibody to target the retroviral
vector. Those of skill in the art will know of, or can readily ascertain without undue
experimentation, specific polynucleotide sequences which can be inserted into the retroviral
genome to target a specific retroviral vector containing the CARM1 sense or antisense
polynucleotide.

Since recombinant retroviral vectors usually are defective, they require assistance
to produce infectious vector particles. This assistance can be provided, for example, by using
helper cell lines that contain plasmids encoding all of the structural genes of the retrovirus under
the control of regulatory sequences within the LTR. These plasmids are missing a nucleotide
sequence which enables the packaging mechanism to recognize an RNA transcript for
encapsidation. Helper cell lines which have deletions of the packaging signal include but are not
limited to PSI.2, PA317 and PA12, for example. These cell lines produce empty virions, since
no genome is packaged. If a retroviral vector in which the packaging signal is intact, but the
structural genes are replaced by other genes of interest, is introduced into such cells, the vector
will be packaged and vector virions produced.

Since CARM1 promotes the action of nuclear receptors, CARM1 or vectors
expressing CARM1 may be useful as agonists to stimulate processes mediated by nuclear
receptors. For example, glucocorticoids are used as anti-inflammatory agents. Gene therapy
applications of CARM1 may enhance the anti-inflammatory effects of glucocorticoids and could
thus enhance the glucocorticoids' therapeutic effectiveness or reduce the concentration of
glucocorticoids required to provide the desired anti-inflammatory effects.

The CARM1 nucleotide and predicted amino acid sequence, combined with the
functional domains of CARM1, can be used to design modified forms of CARM1 that lack the
methyltransferase activity but retain the ability to bind GRIP1-type coactivators. For example,
we have shown that mutations in the region of CARM1 that contains the methyltransferase
activity produce such a modified CARM1 protein. Also, a fragment of CARM1 protein that
contains the GRIP1-binding function but lacks the methyltransferase region will have the
same properties. Such forms of CARM1 have a "dominant negative" effect on nuclear receptor
function; i.e., when expressed in cells, these dominant negative forms of CARM1 reduce the
activity of nuclear receptors. This approach is effective in cells that naturally express CARM1 or
a functionally equivalent protein from the native endogenous gene. The dominant negative
variant of CARM1 interferes with the function of the endogenous CARM1 (or functionally
equivalent protein) as follows: When nuclear receptors bind to a target gene, they recruit a
GRIP1-type coactivator, which would normally recruit CARM1. However, if the dominant
negative form of CARM1 is expressed in higher levels than the endogenous intact CARM1, then
the dominant negative CARM1 is more likely to bind to GRIP1 instead of the endogenous active
CARM1. The recruited dominant negative form of CARM1 fails to activate gene expression
(since it has no methyltransferase activity), and also blocks the endogenous intact CARM1 protein from
binding to GRIP1 and carrying out its function. Thus, the expression of the dominant negative
CARM1 reduces the nuclear receptor's ability to activate gene expression by interfering with the
function of endogenous CARM1. The same forms of CARM1 should have a dominant negative
effect on any transcription factor whose function is normally enhanced by intact CARM1. We
have demonstrated that a CARM1 mutant (CARM1 VLD mutant), in which the amino acids
valine 189, leucine 190, and aspartic acid 191 (V189A/L190A/D191A) have all been changed to
alanine, lacks methyltransferase activity, lacks coactivator activity, and inhibits nuclear receptor
function in conditions where GRIP1-type coactivators are limiting.
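
The substitution notation used for the VLD mutant (wild-type residue, 1-based position, replacement residue) can be applied to a sequence programmatically. The Python sketch below is illustrative only. The function `apply_substitutions` is hypothetical, and the input is a placeholder sequence with V, L, and D placed at positions 189 to 191, not the CARM1 sequence of SEQ ID NO: 2.

```python
import re


def apply_substitutions(sequence: str, substitutions: list[str]) -> str:
    """Apply point substitutions written like 'V189A' (1-based positions)."""
    residues = list(sequence)
    for sub in substitutions:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", sub)
        if match is None:
            raise ValueError(f"cannot parse substitution: {sub!r}")
        wild_type, replacement = match.group(1), match.group(3)
        index = int(match.group(2)) - 1  # convert to 0-based index
        if not 0 <= index < len(residues):
            raise ValueError(f"position out of range in {sub!r}")
        # Guard against numbering errors: the stated wild-type residue
        # must actually be present at that position.
        if residues[index] != wild_type:
            raise ValueError(f"expected {wild_type} at position {index + 1}")
        residues[index] = replacement
    return "".join(residues)


# Placeholder 200-residue sequence (NOT SEQ ID NO: 2): glycines with
# V, L, D at positions 189-191 so the notation can be demonstrated.
placeholder = "G" * 188 + "VLD" + "G" * 9
vld_mutant = apply_substitutions(placeholder, ["V189A", "L190A", "D191A"])
```

Checking the wild-type residue before replacing it catches off-by-one numbering mistakes, which are easy to make when positions are counted from a different start site.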

Examples of specific uses for such antagonistic reagents are in the treatment of
breast cancer and prostate cancer. Most breast cancers, at least initially, rely on estrogen for
growth; and most prostate cancers, at least initially, depend on androgens for growth. Since
CARM1 promotes estrogen and androgen receptor action, antagonists of CARM1 or other
methyltransferases may block or partially block the growth promoting effects of the hormones
estrogen and androgen on these tumors. These antagonists may serve as effective
chemotherapeutic agents, either when used alone or when used in combination with other types
of treatments.

POLYPEPTIDES OF THE INVENTION

The isolated polypeptides of the invention include, but are not limited to, a 
polypeptide comprising the amino acid sequence of SEQ ID NO: 2 or fragments thereof. 

The invention also relates to methods for producing a polypeptide comprising
growing a culture of the cells of the invention in a suitable culture medium, and purifying the
protein from the culture. For example, the methods of the invention include a process for
producing a polypeptide in which a host cell containing a suitable expression vector that includes
a polynucleotide of the invention is cultured under conditions that allow expression of the
encoded polypeptide. The polypeptide can be recovered from the culture, conveniently from the
culture medium, and further purified. Preferred embodiments include those in which the protein
produced by such process is a full length or mature form of the protein.

The invention also provides a polypeptide including an amino acid sequence that
is substantially equivalent to SEQ ID NO: 2. Polypeptides according to the invention can have at
least about 80%, more typically at least about 90%, and even more typically at least about 95%,
sequence identity to SEQ ID NO: 2.

The present invention further provides isolated polypeptides encoded by the
nucleic acid fragments of the present invention or by degenerate variants of the nucleic acid
fragments of the present invention. By "degenerate variant" is intended nucleotide fragments
which differ from a nucleic acid fragment of the present invention (e.g., an ORF) by nucleotide
sequence but, due to the degeneracy of the genetic code, encode an identical polypeptide
sequence. Preferred nucleic acid fragments of the present invention are the ORFs that encode
proteins.

Methodologies known in the art can be utilized to obtain any one of the isolated
polypeptides or proteins of the present invention. At the simplest level, the amino acid sequence
can be synthesized using commercially available peptide synthesizers. This is particularly useful
in producing small peptides and fragments of larger polypeptides. Fragments are useful, for
example, in generating antibodies against the native polypeptide. In an alternative method, the
polypeptide or protein is purified from cells which naturally produce the polypeptide or protein.
One skilled in the art can readily follow known methods for isolating polypeptides and proteins
to obtain one of the isolated polypeptides or proteins of the present invention. These include, but
are not limited to, immunochromatography, HPLC, size-exclusion chromatography,
ion-exchange chromatography, and immuno-affinity chromatography. See, e.g., Scopes, Protein
Purification: Principles and Practice, Springer-Verlag (1994); Sambrook, et al, in Molecular
Cloning: A Laboratory Manual (supra); Ausubel et al, Current Protocols in Molecular Biology
(supra).

The polypeptides and proteins of the present invention can alternatively be
purified from cells which have been altered to express the desired polypeptide or protein. One
skilled in the art can readily adapt procedures for introducing and expressing either recombinant
or synthetic sequences into eukaryotic or prokaryotic cells in order to generate a cell which
produces one of the polypeptides or proteins of the present invention. The purified polypeptides
can be used in in vitro binding assays which are well known in the art to identify molecules
which bind to the polypeptides.

The protein may also be produced by known conventional chemical synthesis.
Methods for constructing the proteins of the present invention by synthetic means are known to
those skilled in the art. For polypeptides more than about 100 amino acid residues, a number of
smaller peptides will be chemically synthesized and ligated either chemically or enzymatically to
provide the desired full-length polypeptide. The synthetically-constructed protein sequences, by
virtue of sharing primary, secondary or tertiary structural and/or conformational characteristics
with naturally occurring proteins, may possess biological properties in common therewith. Thus,
they may be employed as biologically active or immunological substitutes for natural, purified
proteins in screening of therapeutic compounds and in immunological processes for the
development of antibodies.

The proteins provided herein also include proteins characterized by amino acid
sequences substantially equivalent to those of purified proteins but into which modifications are
naturally provided or deliberately engineered. For example, modifications in the peptide or DNA
sequences can be made by those skilled in the art using known techniques. Modifications of
interest in the protein sequences may include the alteration, substitution, replacement, insertion
or deletion of a selected amino acid residue in the coding sequence. For example, one or more of
the cysteine residues may be deleted or replaced with another amino acid to alter the
conformation of the molecule. Techniques for such alteration, substitution, replacement,
insertion or deletion are well known to those skilled in the art (see, e.g., U.S. Pat. No. 4,518,584).
Preferably, such alteration, substitution, replacement, insertion or deletion retains the desired
activity of the protein.

Other fragments and derivatives of the sequences of proteins which would be
expected to retain protein activity in whole or in part and may thus be useful for screening or
other immunological methodologies may also be easily made by those skilled in the art given the
disclosures herein. Such modifications are intended to be encompassed by the present invention.

The protein of the invention may also be expressed in a form that will facilitate
purification. For example, it may be expressed as a fusion protein, such as those of maltose
binding protein (MBP), glutathione-S-transferase (GST) or thioredoxin (TRX). Kits for
expression and purification of such fusion proteins are commercially available from New
England BioLabs (Beverly, Mass.), Pharmacia (Piscataway, N.J.) and Invitrogen (Carlsbad, CA),
respectively. The protein also can be tagged with an epitope and subsequently purified by using
a specific antibody directed to such epitope. One such epitope ("Flag") is commercially available
from Kodak (New Haven, Conn.).

Our knowledge of CARM1 should make it possible to design and screen drugs
that block the methyltransferase activity of CARM1. The CARM1 protein can in principle be
used for X-ray crystallographic, or other structural studies, to determine the 3-dimensional
structure of the active site (including the binding sites for S-adenosylmethionine and the protein
substrate which accepts methyl groups) of the methyltransferase region of CARM1. Once
determined, this structure can be used for rational drug design, to design drugs to block the
substrate binding and active sites of CARM1. These or randomly selected candidates can be
screened using the methyltransferase activity assays we have developed.

There are other protein arginine methyltransferases related to CARM1 (Lin, W-J.
et al 1996; Gary, J.D. et al 1996; Aletta, J.M. et al 1998), and there may be others which are
unknown at this time. Some of these other protein arginine methyltransferases and possibly even
some other types of protein methyltransferases (e.g., lysine methyltransferases and carboxyl
methyltransferases (Aletta, J.M. et al 1998)) may also be involved in gene regulation by a
mechanism similar to that of CARM1. Our knowledge of the CARM1 sequence and mechanism
provides the tools to search for related genes and proteins and the knowledge to determine
whether any of these other methyltransferases are involved in regulation of transcription.

ANTIBODIES 



Another aspect of the invention is an antibody that specifically binds the
polypeptide of the invention. Such antibodies can be either monoclonal or polyclonal antibodies,
as well as fragments thereof and humanized forms or fully human forms, such as those produced
in transgenic animals. The invention further provides a hybridoma that produces an antibody
according to the invention. Antibodies of the invention are useful for detection and/or
purification of the polypeptides of the invention.

Protein of the invention may also be used to immunize animals to obtain
polyclonal and monoclonal antibodies that react specifically with the protein. Such antibodies
may be obtained using either the entire protein or fragments thereof as an immunogen. The
peptide immunogens additionally may contain a cysteine residue at the amino or carboxyl
terminus, and are conjugated to a hapten such as keyhole limpet hemocyanin (KLH). Methods
for synthesizing such peptides are known in the art, for example, as in R. P. Merrifield, J. Amer.
Chem. Soc. 85, 2149-2154 (1963); J. L. Krstenansky, et al, FEBS Lett. 211, 10 (1987).
Monoclonal antibodies binding to the protein of the invention may be useful diagnostic agents
for the immunodetection of the protein. Neutralizing monoclonal antibodies binding to the
protein may also be useful therapeutics for conditions associated with excess production or
accumulation of the protein. In general, techniques for preparing polyclonal and monoclonal
antibodies as well as hybridomas capable of producing the desired antibody are well known in
the art (Campbell, A.M., Monoclonal Antibodies Technology: Laboratory Techniques in
Biochemistry and Molecular Biology, Elsevier Science Publishers, Amsterdam, The Netherlands
(1984); St. Groth et al, J. Immunol 35:1-21 (1990); Kohler and Milstein, Nature 256:495-497
(1975)). Other useful techniques include the trioma technique and the human B-cell hybridoma
technique (Kozbor et al, Immunology Today 4:72 (1983); Cole et al, in Monoclonal Antibodies
and Cancer Therapy, Alan R. Liss, Inc. (1985), pp. 77-96).

Any animal (rabbit, etc.) which is known to produce antibodies can be immunized
with a peptide or polypeptide of the invention. Methods for immunization are well known in the
art. Such methods include subcutaneous or intraperitoneal injection of the polypeptide. One
skilled in the art will recognize that the amount of the protein encoded by the ORF of the present
invention used for immunization will vary based on the animal which is immunized, the
antigenicity of the peptide and the site of injection. The protein that is used as an immunogen
may be modified or administered with an adjuvant to increase the protein's antigenicity.
Methods of increasing the antigenicity of a protein are well known in the art and include, but are
not limited to, coupling the antigen with a heterologous protein (such as globulin or
β-galactosidase) or through the inclusion of an adjuvant during immunization.

For monoclonal antibodies, spleen cells from the immunized animals are
removed, fused with myeloma cells, such as SP2/0-Ag14 myeloma cells, and allowed to become
monoclonal antibody producing hybridoma cells. Any one of a number of methods well known
in the art can be used to identify the hybridoma cell which produces an antibody with the desired
characteristics. These include screening the hybridomas with an ELISA assay, western blot
analysis, or radioimmunoassay (Lutz et al., Exp. Cell Research, 175:109-124 (1988)).

Hybridomas secreting the desired antibodies are cloned and the class and subclass
are determined using procedures known in the art (Campbell, A.M., Monoclonal Antibody
Technology: Laboratory Techniques in Biochemistry and Molecular Biology, Elsevier Science
Publishers, Amsterdam, The Netherlands (1984)). Techniques described for the production of
single chain antibodies (U.S. Patent 4,946,778) can be adapted to produce single chain antibodies
to proteins of the present invention.

For polyclonal antibodies, antibody-containing antiserum is isolated from the
immunized animal and is screened for the presence of antibodies with the desired specificity
using one of the above-described procedures. The present invention further provides the
above-described antibodies in detectably labeled form. Antibodies can be detectably labeled through
the use of radioisotopes, affinity labels (such as biotin, avidin, etc.), enzymatic labels (such as
horseradish peroxidase, alkaline phosphatase, etc.), fluorescent labels (such as FITC or
rhodamine, etc.), paramagnetic atoms, etc. Procedures for accomplishing such labeling are well
known in the art, for example, see Sternberger, L.A. et al., J. Histochem. Cytochem. 18:315
(1970); Bayer, E.A. et al., Meth. Enzym. 62:308 (1979); Engval, E. et al., Immunol. 109:129
(1972); Goding, J.W., J. Immunol. Meth. 13:215 (1976).

In diagnostic uses, it is possible that some medical conditions may derive from
abnormal forms or levels of expression of CARM1 or other methyltransferases. Thus, nucleic
acid and antibody reagents derived from CARM1 or other methyltransferases may be used to
screen humans for such abnormalities. Similarly, there may be different alleles of CARM1 or
other methyltransferases that predispose carriers to be more susceptible to specific drugs or
diseases. The CARM1 reagents can be used to define such allelic variations and subsequently to
screen for them.

Antibodies of the invention can also be generated that specifically recognize
substrates that have been methylated by CARM1. For example, CARM1 methylates residues
arg2, arg17 and arg26 of histone H3. Antibodies to peptides containing methylated arginines at
these, or other positions of CARM1 methylation, are useful for studying the role of methylation
in gene expression.

METHYLTRANSFERASE ACTIVITY 

CARM1 can transfer one or more methyl groups from S-adenosylmethionine to an
arginine residue in proteins and in synthetic peptides. Appropriate substrate proteins include
histones. CARM1 can transfer methyl groups from S-adenosylmethionine to one or more
arginine residues in histone H3, producing monomethylarginine and asymmetrically dimethylated
N^G,N^G-dimethylarginine residues in histone H3. The CARM1 VLD (SEQ ID NO:3) mutant lacks
methyltransferase activity for both histone H3 and the synthetic peptide substrates and lacks
coactivator activity.

SUBSTRATES OF CARM1

While the identity of additional proteins (other than histone H3) that CARM1
methylates remains unknown, we have established a procedure for identifying proteins that are
methylated by CARM1, by incubating candidate substrate proteins or protein fractions or extracts
with recombinant CARM1 and S-adenosylmethionine and then analyzing the products by
chromatography or electrophoresis. The purified protein can be sequenced to learn its identity.
The yeast two hybrid system, used to discover CARM1, also should be useful for defining
proteins that bind to CARM1 methyltransferase and thus are possible substrates. Once
identified, these methylation substrates of CARM1 should be useful as reagents for studying the
role and mechanism of methylation in gene regulation. They also serve as additional sites of
intervention for blocking or enhancing gene expression. This is accomplished by increasing the
expression of the protein substrate or by reducing expression of the protein substrate, for example
by using antisense techniques or by expressing altered forms of the protein substrate which have
a dominant negative effect and thus block the function of the endogenous native protein
substrate.

Because protein methylation is involved in regulation of gene transcription, a
mechanism for demethylation of the same proteins likely exists. Histone H3 is a substrate for
CARM1, and we have described methods for identifying other protein substrates of CARM1
above. These methylated proteins can serve as the basis for identifying demethylating enzymes.
In such a method, a methylated protein preparation is incubated with cell extracts, fractions of
cell extracts or with candidate proteins. Demethylation can be monitored by release of
radioactivity if the methylated protein is prepared with radioactively labeled
S-adenosylmethionine. Demethylation can also be monitored by chromatographic changes, using
techniques such as ion-exchange chromatography; by mass spectrometry; by spectroscopic
techniques such as fluorescence spectroscopy; or by immunoassays with antibodies raised
against the methylated or non-methylated forms of the protein. Once identified, these
demethylating enzymes can be used as the basis for developing reagents to enhance or block
demethylation. Blocking or enhancing demethylation should have the opposite effect from
blocking or enhancing methylation by CARM1.

SCREENING OF CARM1 INHIBITORS

Inhibitors of CARM1 that act through a variety of mechanisms can be discovered
using the methods of the invention. In one embodiment, molecules are screened for their ability
to inhibit CARM1 methyltransferase activity. Methyltransferase activity can be determined using
any of the assays described herein, or other suitable biochemical assays. For example, in one
embodiment a substrate protein, such as histone H3, is incubated with a candidate inhibitor
molecule or pool of molecules along with CARM1 (SEQ ID NO: 2) and radioactively labeled
S-adenosylmethionine. The degree of radioactive labeling of the target histone is measured by
separating the labeled protein from the free S-adenosylmethionine. Separation may be effected
by chromatography or by using a low molecular weight cut-off membrane, through which the
free S-adenosylmethionine passes while the labeled protein is retained. The activity of CARM1 is
then compared in the presence and absence of the candidate inhibitor.
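
The comparison of CARM1 activity with and without a candidate inhibitor can be expressed as a
percent inhibition of methyl group incorporation. The following is a minimal sketch only; the
function name, the optional background subtraction, and the example counts are hypothetical
illustrations and are not taken from the experiments described herein.

```python
def percent_inhibition(cpm_with_inhibitor, cpm_without_inhibitor, cpm_background=0.0):
    """Percent inhibition of labeled methyl incorporation into the substrate protein.

    All arguments are radioactivity counts (e.g. cpm) recovered with the labeled
    protein after it has been separated from free S-adenosylmethionine.
    cpm_background is an assumed no-enzyme control and defaults to zero.
    """
    control_signal = cpm_without_inhibitor - cpm_background
    if control_signal <= 0:
        raise ValueError("uninhibited reaction must exceed background")
    test_signal = cpm_with_inhibitor - cpm_background
    return 100.0 * (1.0 - test_signal / control_signal)


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(percent_inhibition(2600.0, 10100.0, cpm_background=100.0))
```

A value near zero indicates no effect of the candidate molecule, and a value near 100 indicates
essentially complete loss of methyltransferase activity under the assay conditions.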

CARM1 inhibitors can also be discovered that prevent interaction of CARM1
with a coactivator such as GRIP1. The disclosed two-hybrid assays for measuring the binding
interaction between coactivators and CARM1 are also suitable for use as a screening system to
identify compounds that can block binding of CARM1 to GRIP1-type coactivators.

In one embodiment, CARM1 (SEQ ID NO: 1) or a fragment thereof, is expressed
in a host cell as a fusion with either a DNA binding domain (DBD) or with a transcriptional
activation domain (AD). DNA binding domains are well known in the art, and can be chosen
from any DNA binding protein or transcription factor. In one embodiment, CARM1 is expressed
fused with the DNA binding domain of Gal4. In another embodiment, CARM1 is expressed
instead fused to a transcriptional activation domain from Gal4.

A GRIP1-type coactivator, or a fragment thereof, is expressed as a fusion with
either a DNA binding domain or with a transcriptional activation domain, but not with the
domain type chosen for CARM1. If CARM1 is fused with a DNA binding domain, then the
GRIP1-type coactivator domain must be fused with a transcriptional activating domain.

In such a method, a reporter gene construct is also provided. The reporter gene
construct comprises a reporter gene and a promoter region. Reporter genes encode a protein that
can be directly observed or can be indirectly observed through an enzymatic activity or through
immunological detection methods. Directly observable proteins can be fluorescent proteins, such
as the green fluorescent protein (GFP) of Aequorea. Indirectly observable proteins commonly
possess an enzymatic activity capable of effecting a chromogenic or fluorogenic change in a
specific substrate. Such proteins include β-lactamase, luciferase and β-galactosidase. Reporter
gene expression can also be monitored with antibodies directed towards the gene product, or by
measuring the RNA levels produced.

In the two hybrid system, the interaction of the CARM1-AD hybrid protein with the
GRIP1-DBD hybrid protein leads to the expression of the reporter gene, and thus, the expression
of the reporter gene serves as an indication that CARM1 and GRIP1 can bind to each other.
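
The logic of the two hybrid screen can be summarized in a few lines of code. This is a simplified
sketch under stated assumptions: the class and function names are hypothetical, binding is
treated as a yes/no property, and an inhibitor is modeled only as something that blocks the
CARM1-coactivator interaction.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fusion:
    protein: str  # e.g. "CARM1" or "GRIP1"
    domain: str   # "DBD" (DNA binding domain) or "AD" (activation domain)


def reporter_expressed(first, second, proteins_bind, inhibitor_blocks_binding=False):
    """Return True if the reporter gene is expected to be expressed.

    One fusion must carry a DBD and the other an AD; the reporter is expressed
    only when the two fused proteins bind each other and no inhibitor prevents it.
    """
    if {first.domain, second.domain} != {"DBD", "AD"}:
        raise ValueError("one fusion must carry a DBD and the other an AD")
    return proteins_bind and not inhibitor_blocks_binding


if __name__ == "__main__":
    carm1 = Fusion("CARM1", "AD")
    grip1 = Fusion("GRIP1", "DBD")
    print(reporter_expressed(carm1, grip1, proteins_bind=True))
    print(reporter_expressed(carm1, grip1, proteins_bind=True,
                             inhibitor_blocks_binding=True))
```

In a screen, a compound that changes the outcome from expressed to not expressed, without
otherwise impairing the host cell, is a candidate inhibitor of the binding interaction.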



Upon consideration of the present disclosure, one of skill in the art will appreciate 
that many other embodiments and variations may be made in the scope of the present invention.
Accordingly, it is intended that the broader aspects of the present invention not be limited to the 
disclosure of the following examples, but rather only to the scope of the appended claims. 

EXAMPLE 1 

Isolation of murine CARM1 cDNA

A 3.2-kb partial CARM1 cDNA clone with an open reading frame of 606 amino
acids (CARM1(3-608)), followed by a 1.4-kb 3'-untranslated region and a poly A sequence, was
isolated from a mouse 17-day embryo library by using the yeast two-hybrid system as described
previously (Hong, H. et al. 1996). The EcoRI library (Clontech) was in vector pGAD10, which
has a leu2 marker gene; the bait was GRIP1c (GRIP1 (1122-1462)) in vector pGBT9 (Clontech),
which has a trp1 marker gene. Further screening of a lambda phage library of mouse 11-day
embryo cDNA clones (Stratagene) identified additional 5'-sequences and allowed construction of
a putative full length coding region for CARM1 (608 amino acids). Amino acids 143-457 of
CARM1 share 30% identity with hPRMT1 and yODP1. A clone coding for a C-terminal
fragment of α-actinin was isolated in the same yeast two hybrid screen with pGBT9.GRIP1c.

A BLAST search of the GenBank database (Altschul, S.F. et al. (1990) J. Mol.
Biol. 215:403-410) indicated that this coding region represents a novel protein, whose central
region shares extensive homology with a family of proteins with arginine-specific protein
methyltransferase activity (Figure 1). We therefore named the new protein Coactivator
Associated arginine (R) Methyltransferase 1 (CARM1). RNA blot analysis indicated that the
CARM1 cDNA represents a 3.8-kb mRNA which is expressed widely, but not evenly, in adult
mouse tissues including heart, brain, liver, kidney, and testis; testis also contains a
homologous 4.1-kb RNA species (Figure 2). Lower expression was observed in spleen, lung,
and skeletal muscle. Northern blot analysis was performed as shown in Figure 2 with a 0.6-kb
BamHI cDNA fragment (representing CARM1 codons 3-198) and with RNA from multiple
tissues as described previously (Hong et al. 1997).

EXAMPLE 2 
Construction of Plasmids 

Mammalian cell expression vector: pSG5.HA was constructed by inserting a
synthetic sequence coding for a translation start signal, HA tag, EcoRI site, and XhoI site into the
EcoRI-BamHI site of pSG5 (Stratagene), which has SV40 and T7 promoters. The original EcoRI
site is destroyed by this insertion, but the BamHI site is preserved, leaving a multiple cloning site
after the HA tag containing EcoRI, XhoI, BamHI, and BglII sites. The following protein coding
regions were cloned into pSG5.HA, in frame with the HA tag, using the indicated insertion sites:
GRIP1 (5-1462) (full length) and CARM1 (3-608) (full length) at the EcoRI site; GRIP1 (5-765)
at the EcoRI-XhoI site; GRIP1 (730-1121) and GRIP1 (1121-1462) were EcoRI-SalI fragments
inserted at the EcoRI-XhoI site; SRC-1a (1-1441) (full length) was a SmaI-SalI fragment inserted
at the EcoRI site, which was blunted by filling with Klenow polymerase, and the XhoI site.
Expression vector for Gal4DBD-GRIP1c was constructed by inserting an EcoRI-BglII fragment
coding for GRIP1 (1122-1462) into pM (Clontech). Vectors for GST fusion proteins were
constructed in pGEX-4T1 (Pharmacia): for GST-CARM1 the original 3.2-kb EcoRI fragment
from pGAD10.CARM1 was inserted; for GST-GRIP1c (amino acids 1122-1462) an EcoRI-SalI
fragment was inserted. Yeast expression vectors for Gal4DBD fused to various GRIP1
fragments were constructed by inserting EcoRI-SalI fragments into pGBT9. The GRIP1cΔ19 and
CARM1 VLD mutations were engineered with the Promega Gene Editor Kit. Construction of
all the above plasmids was described previously (Chen, D. et al. (1999) Science 284:2174-2177).

EXAMPLE 3
Binding interactions of CARM1

This example demonstrates that CARM1 interacts with GRIP1. The binding of
GRIP1c to CARM1 observed in the yeast two-hybrid system was confirmed in vitro by
incubating glutathione S-transferase (GST) fusion proteins attached to glutathione agarose beads
with labeled proteins or protein fragments translated in vitro. GST-CARM1 bound GRIP1c
(amino acids 1122-1462) but not protein fragments representing GRIP1 amino acids 5-765 or
730-1121 (Figure 3A). Conversely, GST-GRIP1c bound CARM1 and the VLD to AAA mutant
of CARM1 (Figure 3B). GST-CARM1 bound not only GRIP1 but also the other two members
of the p160 coactivator family, SRC-1a and ACTR (Figure 3A). Thus, Figure 3 shows the
binding of CARM1 to the C-terminal region of p160 coactivators. GST fusion proteins of
CARM1 or the indicated GRIP1 fragments, produced in E. coli strain BL21 (Stratagene), were
bound to glutathione-agarose beads and incubated with labeled full length CARM1 or p160
coactivators or GRIP1 fragments translated in vitro from vector pSG5.HA-CARM1,
pSG5.HA-GRIP1, pSG5.HA-SRC-1a (Chen, D. et al. 1999), or pCMX.ACTR (Chen et al. 1997);
bound labeled proteins were eluted and analyzed by SDS polyacrylamide gel electrophoresis as
described previously (Hong et al. 1996). A mutant form of CARM1 with the triple amino acid
substitution (VLD changed to AAA) shown in Figure 1 still retains the ability to bind to the
C-terminal fragment of GRIP1.

The binding site for CARM1 in GRIP1c was further mapped by using the yeast
two hybrid system. When GRIP1c was bisected between amino acids 1210 and 1211, the
N-terminal fragment failed to bind CARM1, while the C-terminal fragment retained binding
activity; thus GRIP1 (1211-1462) is sufficient for CARM1 binding while amino acids 1121-1210
are neither necessary nor sufficient (Figure 4). When GRIP1c was bisected between amino acids
1305 and 1306, neither fragment bound CARM1, indicating that sequences near this boundary
were important for CARM1 binding. This conclusion was supported by the finding that deletion
of amino acids 1291-1309 (GRIP1cΔ19 mutant), which are highly conserved among p160
proteins (Anzick, S.L. et al. 1997), eliminates CARM1 binding. The smallest GRIP1 fragment
that binds to CARM1 is the fragment from 1211-1350. Controls with α-actinin, another protein
found to bind GRIP1c in the yeast two hybrid screen, had a different pattern of binding to the
GRIP1 fragments and provided positive and negative controls. We conclude that CARM1 binds
to the C-terminal region of GRIP1 defined by amino acids 1211-1350 and that a highly conserved
stretch of 19 amino acids (1291-1309) is important for CARM1 binding.
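
The fragment mapping reported in this example can be checked for internal consistency with a
short script. This is an illustrative sketch only: the fragment boundaries are those stated in
the text (taking GRIP1c to start at amino acid 1122), while the variable and function names are
hypothetical. The Δ19 deletion mutant is not a contiguous fragment and is therefore not
included in the list.

```python
# (first residue, last residue) of each GRIP1 fragment, and whether it bound CARM1.
FRAGMENT_RESULTS = [
    ((1122, 1462), True),   # GRIP1c
    ((1122, 1210), False),  # N-terminal half of the 1210/1211 bisection
    ((1211, 1462), True),   # C-terminal half of the 1210/1211 bisection
    ((1122, 1305), False),  # N-terminal half of the 1305/1306 bisection
    ((1306, 1462), False),  # C-terminal half of the 1305/1306 bisection
    ((1211, 1350), True),   # smallest binding fragment reported
]

CONSERVED_REGION = (1291, 1309)  # the 19 conserved amino acids


def contains(fragment, region):
    """True if the fragment spans the whole region."""
    return fragment[0] <= region[0] and region[1] <= fragment[1]


def smallest_binding_fragment(results):
    """Return the shortest fragment that bound CARM1."""
    binders = [fragment for fragment, bound in results if bound]
    return min(binders, key=lambda f: f[1] - f[0] + 1)


if __name__ == "__main__":
    for fragment, bound in FRAGMENT_RESULTS:
        intact = contains(fragment, CONSERVED_REGION)
        print(fragment, "bound" if bound else "not bound",
              "| conserved region intact:", intact)
    print("smallest binder:", smallest_binding_fragment(FRAGMENT_RESULTS))
```

Among the fragments listed, binding is observed exactly when residues 1291-1309 are wholly
present, which is consistent with, but does not by itself prove, the importance of that region.
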

EXAMPLE 4

Enhancement of GRIP1 and NR function by secondary coactivator CARM1

This example demonstrates that CARM1 expression in mammalian cells enhances
the transcriptional activation activity of GRIP1c fused to the DBD of Gal4 protein. In transient
transfections of CV-1 cells, Gal4DBD-GRIP1c weakly activates expression of a reporter gene
with a promoter containing Gal4 binding sites; co-expression of CARM1 enhances reporter gene
activity in a dose-dependent manner and provides a maximum stimulation of more than 10-fold
(Figure 5). CARM1 expression has little if any effect on the activity of Gal4DBD alone (Figure
6A). CARM1 also enhances the activity of full length GRIP1 fused to Gal4DBD.

CARM1 also enhances GRIP1's coactivator function for nuclear receptors (NR).
When the androgen receptor, estrogen receptor, and thyroid hormone receptor are expressed in
CV-1 cells by transient transfection, their abilities to activate transcription of a reporter gene
carrying appropriate hormone response elements in the promoter are hormone dependent (Figure
6B, lanes a & b). Co-expression of GRIP1 from a co-transfected plasmid causes a 2 to 27-fold
enhancement of reporter gene expression by the hormone-activated NR (lane d). These activities
are enhanced 2 to 4-fold more by co-expression of CARM1 with the NR and GRIP1 (lane e).
However, in the absence of exogenous GRIP1, CARM1 has little or no effect on the activity of
the NR (lane c). Co-expression of NR, GRIP1, and CARM1 in the absence of hormone produces
extremely low reporter gene activities equivalent to those seen with NR alone in the absence of
hormone (lane f). A similar enhancement of NR function by CARM1 is observed when SRC-1a
or ACTR (two other GRIP1-related coactivators) is substituted for GRIP1 in a similar
experiment. The fact that CARM1's ability to enhance NR activity depends on co-expression of
exogenous GRIP1 is consistent with a model whereby CARM1 interacts with NRs indirectly,
through a p160 coactivator, rather than directly (Figure 7). It also suggests that in the transient
transfection assays, the expression of exogenous NRs renders the levels of endogenous p160
coactivators limiting, so that the effects of exogenous CARM1 expression can only be observed
when additional p160 coactivators are also expressed. We conclude that CARM1 acts as a
secondary coactivator for NRs by binding to and mediating or enhancing the activity of the p160
primary coactivators.

EXAMPLE 5
Histone methyltransferase activity of CARM1



This example shows that CARM1 is a protein arginine methyltransferase. The
homology between CARM1 and arginine-specific protein methyltransferases includes sequences
that are highly conserved throughout the family and are believed to be important for
methyltransferase activity (Figure 1). We compared the methyltransferase activities of GST
fusion proteins of CARM1 and a related mammalian enzyme, Protein arginine (R)
MethylTransferase 1 (PRMT1) (Lin, W-J. et al. 1996), for various substrates, using
S-adenosylmethionine labeled in the donor methyl group. Mixed histones are good substrates for
both enzymes (Figure 8). Gel electrophoresis and autoradiography of the methylated histone
products, and tests with purified individual histone species, indicate that CARM1 methylates
histones H3 and H2a, while PRMT1 methylates histones H4 and H2a (Figure 8B). Both
enzymes methylate histone H2a in the absence of other histones but not in the histone mixture,
suggesting that hetero-oligomerization of the histones may render histone H2a inaccessible to
methylation (Figure 8B). The positions of the small amounts of labeled products in the histone
H2b lanes for CARM1 and PRMT1 suggest that these products are minor amounts of H3 and H4
contaminating the H2b preparation. The specific activities of CARM1 and PRMT1 with the
mixed histone substrate are very similar (Table 1). Our result for PRMT1 is different from one
in a previous report (Gary and Clarke 1998), that PRMT1 methylates histone H2b but none of the
other core histones. RNA binding protein hnRNPA1 is a good substrate for PRMT1, as shown
previously (Lin, W-J. et al. 1996), but is not methylated by CARM1 (Figure 8C). Both enzymes
methylate the glycine-rich R1 peptide substrate (SEQ ID NO: 4: GGFGGRGGFG-NH2), which
was previously shown to be a good substrate for PRMT1 and other protein arginine
methyltransferases (Lin, W-J. et al. 1996; Najbauer, J. et al. 1993). However, this peptide is a
relatively poor substrate for CARM1; the specific activity of GST-PRMT1 for the R1 peptide is
approximately 100 times higher than that of GST-CARM1 (Table 1). CARM1 fails to methylate
the same peptide with lysine substituted for the arginine residue, demonstrating its specificity for
arginine.

Table 1. Relative methyltransferase activities of GST-CARM1 and GST-PRMT1.
Methyltransferase reactions (50 µl) were carried out as described in Figure 8 at enzyme
concentrations of 0.03-0.05 mg/ml. Reactions were stopped by addition of 25 µl of 1.5% (v/v)
trifluoroacetic acid (TFA), 15% (v/v) acetonitrile in water and subjected to reversed-phase HPLC
as described (Najbauer et al. 1993) to separate the substrate from unreacted
S-adenosylmethionine. For the histone methylation, TFA in the HPLC solvents was increased to
0.3% (v/v) and the gradient was modified to accommodate the more retentive behavior of the
histones.

                                  Methyltransferase specific activity (pmol/min/mg)

Substrate                         GST-CARM1            GST-PRMT1

R1 peptide (120 nM)               21.6-54.5 (a)        3,070
SEQ ID NO:4
GGFGGRGGFG-NH2

K1 peptide (120 µM)               0.7                  not determined
SEQ ID NO:5
GGFGGKGGFG-NH2

mixed histones (2.7 mg/ml)        971                  1,180
(calf thymus)

(a) Result of two separate determinations using different preparations of GST-CARM1.
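
The relative activities discussed in this example follow directly from the values in Table 1. The
short script below is an illustrative sketch: the numerical values are copied from Table 1, while
the function and variable names are hypothetical. It computes the PRMT1/CARM1 ratio for the R1
peptide (roughly 56- to 142-fold, depending on which of the two GST-CARM1 determinations is
used, which brackets the "approximately 100 times" stated above) and the corresponding ratio
for mixed histones.

```python
def specific_activity(pmol_methyl, minutes, mg_enzyme):
    """Specific activity in pmol/min/mg, the unit used in Table 1."""
    if minutes <= 0 or mg_enzyme <= 0:
        raise ValueError("time and enzyme mass must be positive")
    return pmol_methyl / (minutes * mg_enzyme)


# Values copied from Table 1 (pmol/min/mg).
CARM1_R1_RANGE = (21.6, 54.5)  # two separate GST-CARM1 preparations
PRMT1_R1 = 3070.0
CARM1_K1 = 0.7
CARM1_HISTONES = 971.0
PRMT1_HISTONES = 1180.0

# PRMT1/CARM1 ratio for the R1 peptide, bracketed by the two CARM1 values.
r1_ratio_min = PRMT1_R1 / CARM1_R1_RANGE[1]
r1_ratio_max = PRMT1_R1 / CARM1_R1_RANGE[0]

# PRMT1/CARM1 ratio for the mixed histone substrate.
histone_ratio = PRMT1_HISTONES / CARM1_HISTONES

# CARM1 preference for the arginine peptide over the lysine peptide (lower bound).
r1_over_k1_min = CARM1_R1_RANGE[0] / CARM1_K1

if __name__ == "__main__":
    print(f"R1 peptide, PRMT1/CARM1: {r1_ratio_min:.0f}- to {r1_ratio_max:.0f}-fold")
    print(f"mixed histones, PRMT1/CARM1: {histone_ratio:.2f}-fold")
    print(f"CARM1, R1/K1 peptide: at least {r1_over_k1_min:.0f}-fold")
```

The contrast between a large ratio for the R1 peptide and a ratio close to one for mixed histones
is the quantitative basis for the statement that the two enzymes have similar activity on
histones but very different activity on the glycine-rich peptide.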

EXAMPLE 6 

Identification of the methylated amino acids produced in histone H3 by CARM1

Histone H3 was incubated for 60 min at 30°C in a 100 µl methylation reaction as
described in Figure 8, containing 0.024 mg/ml GST-CARM1 and 0.63 mg/ml H3. The reaction
was stopped with 25 µl of 3% (v/v) trifluoroacetic acid (TFA), 15% (v/v) acetonitrile, and 100 µl
was injected into a 3 cm x 4.6 mm RP-300 reversed-phase guard column (Perkin
Elmer-Brownlee) equilibrated with 80% solvent A (0.3% TFA in water) and 20% solvent B
(0.3% TFA in acetonitrile). Methylated H3 was separated from unreacted S-adenosylmethionine
using a gradient of 20-80% solvent B over 5 min at a flow rate of 1.0 ml/min. H3 eluted as a
broad complex peak detected by monitoring absorbance at 214 nm. The H3 pool was reduced to
dryness in a vacuum centrifuge and then subjected to acid hydrolysis in 6 N HCl at 112 °C for 20
h. A portion of the hydrolyzate was derivatized with o-phthaldialdehyde (Jones, B.N., Methods
of Protein Microcharacterization, J.E. Shively, Ed. (Humana Press, Clifton, NJ, 1986), p. 337)
and injected into a 10 cm x 4.6 mm Rainin Microsorb 80OPA-C3 column fitted with a guard
module and equilibrated with 95% solvent A (50 mM Na-acetate, pH 5.9:methanol:
tetrahydrofuran, 79:20:1) and 5% solvent B (50 mM Na-acetate, pH 5.9:methanol, 20:80). Elution
was carried out with a linear gradient of 5-40% B over 20 min at a flow rate of 1.0 ml/min.
Radioactivity in the fractions was determined by liquid scintillation counting, and peak identity
was determined by comparison to derivatized standards including the three major forms of
methylarginine and methyllysine. In addition, another portion of the acid hydrolyzate was
subjected to ascending chromatography on thin layers of cellulose using pyridine:acetone:
ammonium hydroxide:water (15:9:1.5:6) (Desrosiers, R. and Tanguay (1988) J. Biol. Chem.
263:4686). Radioactive spots corresponding to the positions of the three forms of methylarginine
(which all separated from each other) were removed by scraping the chromatogram, and
quantified by liquid scintillation counting. Sources of standards: monomethyl-L-arginine and
trimethyl-L-lysine, Calbiochem; N,N'-dimethyl-L-arginine and monomethyl-L-lysine, Sigma;
N,N-dimethyl-L-arginine, Chemical Dynamics Corp.; dimethyl-L-lysine, Serva.
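
The quantities implied by the protocol above can be worked out with simple arithmetic. The
sketch below is illustrative only; the function names are hypothetical, and it assumes that the
100 µl injected was drawn from the 125 µl of stopped reaction mixture and that both gradients
were linear in time.

```python
def amount_ug(conc_mg_per_ml, volume_ul):
    """Mass in micrograms; mg/ml is numerically equal to ug/ul."""
    return conc_mg_per_ml * volume_ul


def fraction_injected(reaction_ul, stop_ul, injected_ul):
    """Fraction of the stopped reaction mixture loaded onto the column."""
    total_ul = reaction_ul + stop_ul
    if injected_ul > total_ul:
        raise ValueError("cannot inject more than the total volume")
    return injected_ul / total_ul


def percent_b(t_min, start_pct, end_pct, duration_min):
    """Percent solvent B at time t_min for a linear gradient."""
    if not 0 <= t_min <= duration_min:
        raise ValueError("time outside the gradient")
    return start_pct + (end_pct - start_pct) * t_min / duration_min


if __name__ == "__main__":
    print("GST-CARM1 in reaction (ug):", amount_ug(0.024, 100))
    print("histone H3 in reaction (ug):", amount_ug(0.63, 100))
    print("fraction injected:", fraction_injected(100, 25, 100))
    print("%B midway, 20-80% over 5 min:", percent_b(2.5, 20, 80, 5))
    print("%B midway, 5-40% over 20 min:", percent_b(10, 5, 40, 20))
```

Under these assumptions the reaction contains about 2.4 µg of enzyme and 63 µg of histone H3,
and four fifths of the stopped mixture is loaded onto the guard column.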

When histone H3 is methylated by CARM1, hydrolyzed to amino acids,
derivatized, and analyzed by high performance liquid chromatography (as described above), all
of the radioactivity from histone H3 co-elutes in a single peak along with the derivatized
standards of N^G-monomethylarginine and N^G,N^G-dimethylarginine (which did not separate from
each other). The radioactive peak was well separated from standards of
N^G,N'^G-dimethylarginine, N^ε-monomethyllysine, N^ε-dimethyllysine, and N^ε-trimethyllysine. On
thin layer chromatography of the hydrolyzate, approximately 70% of the radiolabel migrated with
N^G,N^G-dimethylarginine (asymmetrically dimethylated in the guanidino group) and the remaining
30% with N^G-monomethylarginine. In confirmation of the HPLC results, no significant label
migrated with N^G,N'^G-dimethylarginine (symmetrically dimethylated in the guanidino group).
Methylation of mixed histones by PRMT1 was previously shown to produce the same types of
methylated arginine residues (Gary and Clarke 1998). However, while they produce the same
types of methylated arginine residues, CARM1 and PRMT1 have dramatically different protein
substrate specificities (Figure 8 and Table 1). Histone H4, nucleolin, fibrillarin and hnRNPA1,
as well as the peptide substrate, all have arginine-containing glycine-rich motifs, whereas histone
H3 does not (Najbauer, J. et al. 1993; Lin, W-J. et al. 1996; GenBank Accession Numbers, for
calf thymus histone H3, 70749, and for histone H4, 70762). Thus, it appears that PRMT1 prefers
to methylate arginines found in the glycine-rich motifs, whereas CARM1 targets a different
arginine-containing motif in proteins.

EXAMPLE 7 

Sites of CARM1 methylation of histone H3

CARM1 methylated the following residues of histone H3, as determined by mass
spectrometry analysis: arg2 (minor), arg17 (major), arg26 (major), and one or more of the 4
arginine residues within the histone H3 peptide region comprising residues 128-134. N-terminal
sequencing of histone H3 labeled by CARM1-mediated methylation confirmed that within the
first 20 amino acids of histone H3, arg2 was a minor methylation site, arg17 was a major
methylation site, and arg8 was not methylated (those are the only three arg residues within the
first 20 amino acids of H3). The sequencing run was only able to analyze the first 20 amino
acids from the N-terminus.
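
The arginine positions referred to above can be located by scanning the N-terminal sequence of
histone H3. The sketch below is an illustration under an explicit assumption: the 30-residue
sequence is the widely reported N-terminal sequence of mature histone H3 (numbered without the
initiator methionine) and is not reproduced from this specification; the function name is
hypothetical.

```python
# Assumed N-terminal sequence of mature histone H3 (residues 1-30), one-letter code.
H3_N_TERMINUS = "ARTKQTARKSTGGKAPRKQLATKAARKSAP"


def arginine_positions(sequence, upto=None):
    """Return 1-based positions of arginine (R) within the first `upto` residues."""
    return [i for i, residue in enumerate(sequence[:upto], start=1) if residue == "R"]


if __name__ == "__main__":
    print("arginines in residues 1-20:", arginine_positions(H3_N_TERMINUS, upto=20))
    print("arginines in residues 1-30:", arginine_positions(H3_N_TERMINUS))
```

With this sequence the scan finds arginines at positions 2, 8 and 17 within the first 20
residues, matching the three residues examined by N-terminal sequencing, and an additional
arginine at position 26, the second major site identified by mass spectrometry.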

EXAMPLE 8

The role of methyltransferase activity in transcription

This example shows that CARM1's methyltransferase activity is necessary for its
activity as a coactivator of transcription. We made a mutation in the CARM1 coding sequences
that resulted in replacement of three amino acids, valine 189, leucine 190, and aspartic acid 191,
with alanines. This VLD sequence is located in the region that is most highly conserved among
different members of the protein arginine methyltransferase family (Figure 1) and is believed to
be important for S-adenosylmethionine binding and thus for methyltransferase activity (Lin, W-J.
et al. (1996) J. Biol. Chem. 271:15034-15044). This mutation completely eliminates the ability
of the GST-CARM1 fusion protein to methylate mixed histones (Figure 8C) and peptide
substrate R1 (SEQ ID NO:4). The same mutation essentially eliminates CARM1's ability to
enhance transcriptional activation by a Gal4DBD-GRIP1c fusion protein (Figure 6A) or by the
estrogen receptor (Figure 6B). Immunoblots of transfected COS7 cell extracts indicated that both
wild type and mutant CARM1 were expressed at similar levels. The VLD mutant retains the
ability to bind the C-terminal region of GRIP1 (Figure 3B). The correlated loss of the
methyltransferase activity and coactivator activity of CARM1 indicates that methyltransferase
activity is important for CARM1's coactivator function.

EXAMPLE 9 

Synergy of CARM1 with histone acetyl transferases and other protein arginine
methyltransferases in transcriptional activation

This example shows that CARM1 and other protein arginine methyltransferases
synergistically activate transcription with each other and with histone acetyl transferases. As
shown in Figures 9, 10 and 11, cells were transiently transfected with combinations of plasmids
encoding GRIP1, CARM1, p300, PRMT1, PRMT2, and PRMT3. At low levels of expression of
an appropriate nuclear receptor, in this case estrogen receptor (ER), a combination of GRIP1,
CARM1 and p300 is required for activation of the ER-dependent reporter gene (Figure 10B,
left side). Similar effects are observed if PRMT1 is substituted for CARM1, other p160
coactivators are substituted for GRIP1, or CBP or P/CAF is substituted for p300. p300, CBP,
and P/CAF all have histone acetyltransferase activity. This indicates that histone
methyltransferases and histone acetyltransferases have cooperative or synergistic coactivator
activity and suggests that methylation and acetylation of histones and/or other proteins in the
transcription complex are cooperative processes in the activation of transcription.

The activity of CARM1 is also synergistic with those of PRMT1, PRMT2 or
PRMT3. In cells expressing low levels of the orphan (i.e., no ligand) receptor ERR1 or ERR3,
cotransfection of cells with plasmids encoding GRIP1, CARM1 and PRMT1 results in highly
increased reporter gene expression, as shown in Figure 10A. The synergy between CARM1 and
PRMT1 is also observed with three other nuclear receptors: estrogen, androgen, and thyroid
hormone receptors (Figure 9). Figure 11 shows that CARM1 also acts synergistically with either
PRMT2 or PRMT3.
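
One way to express the synergy described in this example is to compare the reporter signal obtained with a combination of coactivators against the sum of the signals obtained with each coactivator alone. The following is a minimal sketch under that assumption. The function name and all luciferase readings are hypothetical and are not taken from Figures 9 to 11.

```python
def synergy_ratio(combined, singles, baseline):
    """Ratio of the combined gain over baseline to the summed single-factor gains.

    A ratio near 1 suggests additive effects; a ratio well above 1 suggests synergy.
    """
    combined_gain = combined - baseline
    additive_gain = sum(s - baseline for s in singles)
    if additive_gain <= 0:
        raise ValueError("single-factor signals must exceed the baseline")
    return combined_gain / additive_gain


# Hypothetical reporter readings in relative light units (not measured data).
baseline = 100.0            # receptor plus GRIP1 only
grip1_carm1 = 300.0         # GRIP1 + CARM1
grip1_prmt1 = 250.0         # GRIP1 + PRMT1
grip1_carm1_prmt1 = 3600.0  # GRIP1 + CARM1 + PRMT1

ratio = synergy_ratio(grip1_carm1_prmt1, [grip1_carm1, grip1_prmt1], baseline)
print(f"synergy ratio: {ratio:.1f}")
```

With these illustrative numbers the combined gain is ten times the additive expectation, which is the kind of result that would be read as synergy rather than simple additivity.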

Furthermore, due to the very high degree of dependence of the reporter gene
activity on the presence of CARM1 and/or PRMT1, these conditions may prove useful for
screening for inhibitors of the methyltransferase activity or the coactivator activity associated
with these methyltransferases. Such transiently transfected cells, when they contain low levels of a
nuclear receptor, will express the nuclear receptor-dependent reporter gene, in this case luciferase,
only in the presence of GRIP1, CARM1 and p300. Molecules that inhibit either the enzymatic
activities of these coactivators or the protein-protein interactions of these coactivators would
reduce the level of signal from the reporter gene.
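
The readout of such a screen can be sketched as a percent-inhibition calculation on the reporter signal. The example below is illustrative only: the function name, the compound labels, the 50% hit threshold, and every luciferase reading are assumptions introduced here, not values from the specification.

```python
def percent_inhibition(treated, vehicle, background):
    """Percent loss of coactivator-dependent reporter signal caused by a compound.

    vehicle    : signal with GRIP1, CARM1 and p300 but no test compound
    background : signal without the coactivators
    treated    : signal with the coactivators plus the test compound
    """
    window = vehicle - background
    if window <= 0:
        raise ValueError("vehicle signal must exceed background")
    return 100.0 * (vehicle - treated) / window


# Hypothetical luciferase readings (relative light units).
background = 50.0
vehicle = 5050.0
readings = {"compound_A": 550.0, "compound_B": 4800.0}

HIT_THRESHOLD = 50.0  # assumed cutoff, in percent
hits = [name for name, signal in readings.items()
        if percent_inhibition(signal, vehicle, background) >= HIT_THRESHOLD]
print(hits)
```

Because the signal depends so strongly on the coactivators, the window between vehicle and background is large, which is what makes a modest threshold workable in a sketch like this.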

EXAMPLE 10 
Anti-CARM1 Antibody

The peptide SEQ ID NO: 6: (C)SPMSIPTNTMHYGS-COOH, representing the
C-terminal CARM1 amino acid residues 595-608, was coupled to KLH and injected into rabbits.
The (C) is not part of the CARM1 sequence but was added for coupling to KLH. The antiserum
was tested at a dilution of 1:2000 in western blotting. The positive control was CARM1
translated in vitro with no radioactive amino acids; the negative control was a parallel in vitro
translation reaction with no CARM1 mRNA. Products from these two reactions were separated
by molecular weight by SDS-polyacrylamide gel electrophoresis, and the proteins were
transferred from the gel to a nylon membrane. The membrane was incubated with the CARM1
antiserum and then with a secondary HRP-coupled antibody, and bound antibody was visualized
by luminescence. The positive
control gave a very strong band at the expected size for CARM1, while the negative control gave
no band at that position.
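
The correspondence between the immunizing peptide and residues 595-608 of SEQ ID NO: 2 can be checked directly against the last two rows of Table 3. The sketch below assumes the row numbering shown in that table (rows beginning at residues 541 and 601); the helper name is hypothetical.

```python
ROW_START = 541  # residue number of the first character in `tail`

# Rows 541-600 and 601-608 of SEQ ID NO: 2 as printed in Table 3.
tail = ("ntgivnhths rmgsimstgi vqgssgaqgg ggsssahyav nnqftmggpa ismaspmsip"
        " tntmhygs").replace(" ", "")


def residues(first, last):
    """Return residues first..last (1-based, inclusive) from the C-terminal rows."""
    return tail[first - ROW_START:last - ROW_START + 1]


epitope = residues(595, 608).upper()
immunogen = "C" + epitope  # N-terminal cysteine added for coupling to KLH
print(immunogen)
```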

All of the publications which are cited within the body of the instant specification
are hereby incorporated by reference in their entirety.
TABLE 2. SEQ ID NO: 1. M. musculus cDNA for CARM1 (GenBank Accession No.
AF117887).

1 agggggcctg gagccggacc taagatggca gcggcggcag cgacggcggt ggggccgggt 
61 gcggggagcg ctggggtggc gggcccgggc ggcgcggggc cctgcgctac agtgtctgtg

121 ttcccgggcg cccgcctcct cactatcggc gacgcgaacg gcgagatcca gcggcacgcg 
181 gagcagcagg cgctgcgcct tgaggtgcgc gccggaccag acgcggcggg catcgccctc 
241 tacagccatg aagatgtgtg tgttttcaag tgctcggtgt cccgagagac agagtgcagt 
301 cgtgtgggca gacagtcctt catcatcacc ctgggctgca acagcgtcct catccagttt 

361 gccacacccc acgatttctg ttctttctac aacatcctga aaacctgtcg gggccacaca
421 ctggagcgct ctgtgttcag tgagcggaca gaggaatcct cagctgtgca gtacttccag 
481 ttctatggct acctatccca gcagcagaac atgatgcagg actatgtgcg gacaggcacc 
541 taccagcgtg cgatcctgca gaaccacacg gacttcaagg acaagatcgt tctagatgtg 
601 ggctgtggct ctgggatcct gtcatttttt gctgctcaag caggagccag gaaaatttat 

661 gcagtggaag ccagcaccat ggctcagcat gcagaggtcc tggtgaagag taacaatctg
721 acagaccgca tcgtggtcat ccctggcaaa gtagaggagg tctcattgcc tgagcaagtg 
781 gacattatca tctcagagcc catgggctac atgctcttca atgaacgaat gctcgagagc
841 tacctccatg ccaaaaagta cctgaagcct agtggaaaca tgttccccac cattggtgat 
901 gtccacctcg cacccttcac tgatgaacag ctctacatgg agcagttcac caaagccaac 

961 ttccggtacc agccatcctt ccatggagtg gacctgtcgg ccctcagagg tgccgctgtg
1021 gatgagtact tccggcaacc tgtggtggac acatttgaca tccggatcct gatggccaaa 
1081 tctgtcaagt acacagtgaa cttcttagaa gccaaagaag gcgatttgca caggatagaa 
1141 atcccattca aattccacat gctgcattca gggctagtcc atggcttggc cttctggttc 
1201 gatgttgctt tcattggctc cataatgacc gtgtggctat ccacagcccc aacagagccc 

1261 ctgacccact ggtaccaggt ccggtgcctc ttccagtcac cgttgtttgc caaggccggg
1321 gacacgctct cagggacatg tctgcttatt gccaacaaaa gacagagcta tgacatcagt 
1381 attgtggcac aggtggacca gacaggctcc aagtccagta acctgctgga tctaaagaac
1441 cccttcttca ggtacacagg tacaacccca tcacccccac ctggctcaca ctacacgtct 
1501 ccctcggaga atatgtggaa cacaggaagc acctataatc tcagcagcgg ggtggctgtg 

1561 gctggaatgc ctactgccta cgacctgagc agtgttattg ccggcggctc cagtgtgggt
1621 cacaacaacc tgattccctt agctaacaca gggattgtca atcacaccca ctcccggatg 
1681 ggctccataa tgagcacggg cattgtccaa ggctcctcag gtgcccaggg aggcggcggt 
1741 agctccagtg cccactatgc agtcaacaac cagttcacca tgggtggccc tgccatctct 
1801 atggcctcgc ccatgtccat cccgaccaac accatgcact atgggagtta ggtgcctcca 

1861 gccgcgacag cactgcgcac tgacagcacc aggaaaccaa atcaagtcca ggcccggcac
1921 agccagtggc tgttccccct tgttctggag aagttgttga acacccggtc acagcctcct 
1981 tgctatggga acttggacaa ttttgtacac gatgtcgccg ctgccctcaa gtacccccag 
2041 cccaaccttt ggtcccgagc gcgtgttgct gccatacttt acatgagatc ctgttggggc 
2101 agccctcatc ctgttctgta ctctccactc tgacctggct ttgacatctg ctggaagagg 

2161 caagtcctcc cccaaccccc acagctgcac ctgaccaggc aggaggaggc cagcagctgc
2221 caccacagac ctggcagcac ccaccccaca acccgtcctt gcacctcccc tcacctgggg 
2281 tggcagcaca gccagctgga cctctccttc aactaccagg ccacatggtc accatgggcg
2341 tgacatgctg ctttttttaa ttttattttt ttacgaaaag aaccagtgtc aacccacaga 
2401 ccctctgaga aacccggctg gcgcgccaag ccagcagccc ctgttcctag gcccagaggt 

2461 tctaggtgag gggtggccct gtcaagcctt cagagtgggc acagcccctc ccaccaaagg
2521 gttcacctca aacttgaatg tacaaaccac ccagctgtcc aaaggcctag tccctacttt 
2581 ctgctactgt cctgtcctga gccctgaagg cccccctcca tcaaaagctt gaacaggcag 
2641 cccagagtgt gtcaccctgg gctactgggg cagacaagaa acctcaaaga tctgtcacac 
2701 acacacaagg aaggcgtcct ctcctgatag ctgacatagg cctgtgtgtt gcgttcacat 

2761 tcatgttcta cttaatcctc tcaagacagc aaccctggga aggagcctcg cagggacctc

2821 cccagacaag aagaaaagca aacaaggaag ggtgattaat aagcacaggc agtttcccct
2881 attcccttac cctagagtcc ccacctgaat ggccacagcc tgccacagga accccttggc
2941 aaaggctgga gctgctctgt gccaccctcc tgacctgtca gggaatcaca gggccctcag 
3001 gcagctggga accaggctct ctcctgtcca tcagtaatac tccttgctcg gatggccctc 

3061 ccccaccttt atataaattc tctggatcac ctttgcatag aaaataaaag tgtttgcttt
3121 gtaa 
TABLE 3. SEQ ID NO: 2. Deduced amino acid sequence of CARM1 (GenBank Accession No.
AAD41265).

1 maaaaatavg pgagsagvag pggagpcatv svfpgarllt igdangeiqr haeqqalrle 
61 vragpdaagi alyshedvcv fkcsvsrete csrvgrqsfi itlgcnsvli qfatphdfcs
121 fynilktcrg htlersvfse rteessavqy fqfygylsqq qnmmqdyvrt gtyqrailqn
181 htdfkdkivl dvgcgsgils ffaaqagark iyaveastma qhaevlvksn nltdrivvip
241 gkveevslpe qvdiiisepm gymlfnerml esylhakkyl kpsgnmfpti gdvhlapftd
301 eqlymeqftk anfryqpsfh gvdlsalrga avdeyfrqpv vdtfdirilm aksvkytvnf
361 leakegdlhr ieipfkfhml hsglvhglaf wfdvafigsi mtvwlstapt eplthwyqvr
421 clfqsplfak agdtlsgtcl liankrqsyd isivaqvdqt gskssnlldl knpffrytgt
481 tpspppgshy tspsenmwnt gstynlssgv avagmptayd lssviaggss vghnnlipla
541 ntgivnhths rmgsimstgi vqgssgaqgg ggsssahyav nnqftmggpa ismaspmsip 
601 tntmhygs 
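
The relationship between SEQ ID NO: 1 and the deduced sequence of SEQ ID NO: 2 can be illustrated by translating the first row of Table 2 with the standard genetic code. This is a sketch only. It assumes that the open reading frame begins at the first ATG in that row, which is consistent with the start of SEQ ID NO: 2; the function name is hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in t, c, a, g order.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna, start):
    """Translate from 1-based position `start` until a stop codon or the end."""
    protein = []
    for i in range(start - 1, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Nucleotides 1-60 of SEQ ID NO: 1 as printed in Table 2.
first_row = ("agggggcctg gagccggacc taagatggca "
             "gcggcggcag cgacggcggt ggggccgggt").replace(" ", "")

start = first_row.find("atg") + 1  # assumed start codon (first ATG)
print(start, translate(first_row, start).lower())
```

The translated residues match the first twelve residues listed for SEQ ID NO: 2 (maaaaatavg pg).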
TABLE 4. SEQ ID NO: 3. Sequence of CARM1 VLD to AAA variant.



1 maaaaatavg pgagsagvag pggagpcatv svfpgarllt igdangeiqr haeqqalrle 
61 vragpdaagi alyshedvcv fkcsvsrete csrvgrqsfi itlgcnsvli qfatphdfcs
121 fynilktcrg htlersvfse rteessavqy fqfygylsqq qnmmqdyvrt gtyqrailqn
181 htdfkdkiaa avgcgsgils ffaaqagark iyaveastma qhaevlvksn nltdrivvip
241 gkveevslpe qvdiiisepm gymlfnerml esylhakkyl kpsgnmfpti gdvhlapftd
301 eqlymeqftk anfryqpsfh gvdlsalrga avdeyfrqpv vdtfdirilm aksvkytvnf
361 leakegdlhr ieipfkfhml hsglvhglaf wfdvafigsi mtvwlstapt eplthwyqvr
421 clfqsplfak agdtlsgtcl liankrqsyd isivaqvdqt gskssnlldl knpffrytgt
481 tpspppgshy tspsenmwnt gstynlssgv avagmptayd lssviaggss vghnnlipla
541 ntgivnhths rmgsimstgi vqgssgaqgg ggsssahyav nnqftmggpa ismaspmsip 
601 tntmhygs 
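
The VLD to AAA variant differs from SEQ ID NO: 2 only at residues 189, 190 and 191. A minimal sketch of that substitution, applied to the first twenty residues of the row beginning at residue 181, is given below; the helper name is hypothetical.

```python
ROW_START = 181  # residue number of the first character in the row

# Residues 181-200 of SEQ ID NO: 2 as printed in Table 3.
wild_type_row = "htdfkdkivl dvgcgsgils".replace(" ", "")


def substitute(row, row_start, first, last, replacement):
    """Replace residues first..last (1-based, inclusive) with `replacement`."""
    i, j = first - row_start, last - row_start + 1
    if len(replacement) != j - i:
        raise ValueError("replacement must have the same length as the target")
    return row[:i] + replacement + row[j:]


# Confirm that residues 189-191 are V, L, D before substituting.
assert wild_type_row[189 - ROW_START:191 - ROW_START + 1] == "vld"

variant_row = substitute(wild_type_row, ROW_START, 189, 191, "aaa")
print(variant_row)
```

The result agrees with the corresponding residues of SEQ ID NO: 3 (htdfkdkiaa avgcgsgils).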



TABLE 5. SEQ ID NOS: 4 and 5. Peptides used for in vitro methylation experiments. 

R1 peptide SEQ ID NO: 4 GGFGGRGGFG
K1 peptide SEQ ID NO: 5 GGFGGKGGFG

TABLE 6. SEQ ID NO: 6. Peptide used to generate anti-CARM1 antisera.

SEQ ID NO: 6: CSPMSIPTNTMHYGS

TABLE 7. SEQ ID NO: 7. Human PRMT1 (GenBank Accession No. CAA71765).



1 mevscgqaes sekpnaedmt skdyyfdsya hfgiheemlk devrtltyrn smfhnrhlfk 
61 dkvvldvgsg tgilcmfaak agarkvigiv cssisdyavk ivkankldhv vtiikgkvee
121 velpvekvdi iisewmgycl fyesmlntvl yardkwlapd glifpdratl yvtaiedrqy
181 kdykihwwen vygfdmscik dvaikeplvd vvdpkqlvtn aclikevdiy tvkvedltft
241 spfclqykrn dyvhalvayf nieftrchkr tgfstspesp ythwkqtvfy medyltvktg
301 eeifgtigmr pnaknnrdld ftidldfkgq lcelscstdy rmr
TABLE 8. SEQ ID NO: 8. Human PRMT2 (GenBank Accession No. CAA67599)

1 matsgdcprs esqgeepaec seagllqegv qpeefvaiad yaatdetqls flrgekilil
61 rqttadwwwg eragccgyip anhvgkhvde ydpedtwqde eyfgsygtlk lhlemladqp
121 rttkyhsvil qnkesltdkv ildvgcgtgi islfcahyar pravyaveas emaqhtgqlv
181 lqngfadiit vyqgkvedvv lpekvdvlvs ewmgtcllfe fmiesilyar dawlkedgvi
241 wptmaalhlv pcsadkdyrs kvlfwdnaye fnlsalksla vkeffskpky nhilkpedcl
301 sepctilqld mrtvqisdle tlrgelrfdi rkagtlhgft awfsvhfqsl qegqppqyls
361 tgpfhptthw kgtlfmmddp vpvhtgdvvt gsvvlqrnpv wrrhmsvals wavtsrqdpt
421 sqkvgekvfp iwr

TABLE 9. SEQ ID NO: 9. Human PRMT3 (GenBank Accession No. AAC39837) 

1 depelsdsgd eaawededda dlphgkqqtp clfcnrlfts aeetfshcks ehqfnidsmv 
61 hkhglefygy iklinfirlk nptveymnsi ynpvpwekee ylkpvleddl llqfdvedly 
121 epvsvpfsyp nglsentsvv eklkhmeara lsaeaalara redlqkmkqf aqdfvmhtdv
181 rtcssstsvi adlqededgv yfssyghygi heemlkdkir tesyrdfiyq nphifkdkvv
241 ldvgcgtgil smfaakagak kvlgvdqsei lyqamdiirl nkledtitli kgkieevhlp
301 vekvdviise wmgyfllfes mldsvlyakn kylakggsvy pdictislva vsdvnkhadr
361 iafwddvygf kmscmkkavi peavvevldp ktlisepcgi khidchttsi sdlefssdft
421 lkitrtsmct aiagyfdiyf eknchnrvvf stgpqstkth wkqtvfllek pfsvkageal
481 kgkvtvhknk kdprsltvtl tlnnstqtyg lq

TABLE 10. SEQ ID NO: 10. Yeast ODP1 Protein Arginine Methyltransferase (GenBank
Accession No. 6319508).

1 msktavkdsa tektklsese qhyfnsydhy giheemlqdt vrtlsyrnai iqnkdlfkdk
61 ivldvgcgtg ilsmfaakhg akhvigvdms siiemakelv elngfsdkit llrgkledvh
121 lpfpkvdiii sewmgyflly esmmdtvlya rdhylveggl ifpdkcsihl agledsqykd
181 eklnywqdvy gfdyspfvpl vlhepivdtv ernnvnttsd kliefdlntv kisdlafksn
241 fkltakrqdm ingivtwfdi vfpapkgkrp vefstgphap ythwkqtify fpddldaetg
301 dtiegelvcs pneknnrdln ikisykfesn gidgnsrsrk negsylmh
CLAIMS



What is claimed is: 

1. An isolated nucleic acid molecule comprising a sequence substantially
equivalent to that of SEQ ID NO: 1 or a fragment thereof having at least about 40 nucleotides.

2. A recombinant vector comprising the nucleic acid molecule of claim 1.

3. A genetically engineered cell comprising the recombinant vector of claim 2.

4. An isolated polypeptide comprising an amino acid sequence substantially
equivalent to that of SEQ ID NO: 2 or a fragment thereof.

5. The isolated polypeptide of claim 4 further comprising a purification domain.

6. The isolated polypeptide of claim 5 wherein said purification domain is
glutathione-S-transferase.

7. The isolated polypeptide of claim 4 further comprising a DNA binding domain.

8. The isolated polypeptide of claim 7 wherein said DNA binding domain is 
the Gal4 DNA binding domain. 

9. The isolated polypeptide of claim 4 further comprising a transcription 
activation domain. 

10. The isolated polypeptide of claim 9 wherein said transcription activation 
domain is the Gal4 transcription activation domain. 

11. The isolated polypeptide of claim 4 wherein said peptide substantially
lacks methyltransferase activity, but retains the ability to bind to p160 proteins.

12. The isolated polypeptide of claim 11 wherein the substitution of residues
189, 190, and 191 of SEQ ID NO: 2 results in the substantial lack of methyltransferase activity
and retention of p160 binding ability.

13. The isolated polypeptide of claim 12 which has the amino acid sequence
of SEQ ID NO: 3 or a fragment thereof.

14. An antibody directed towards the isolated polypeptide of claim 4. 

15. The antibody of claim 14 wherein said antibody is monoclonal.

16. The antibody of claim 14 wherein said antibody is polyclonal. 

17. A method for methylating amino acid residues within a substrate 
polypeptide comprising contacting the polypeptide of claim 4 with said substrate polypeptide in 
the presence of S-adenosylmethionine.

18. The method of claim 17 wherein said substrate amino acid residue that is 
methylated is arginine. 

19. The method of claim 17 wherein said substrate polypeptide is a histone. 

20. The method of claim 19 wherein said histone is histone H3. 

21. A methylated histone H3 or fragment thereof produced according to the
method of claim 17.

22. An antibody directed towards the methylated histone of claim 21.

23. A method for screening for molecules that modulate an interaction
between CARM1 and GRIP-1 comprising:

expressing within a host cell a first recombinant protein comprising SEQ
ID NO: 2 or a fragment thereof fused to a first interaction domain;

expressing within said host cell a second recombinant protein comprising
a CARM1-interacting protein or fragment thereof fused to a second interaction domain;

providing in said host cell a reporter gene construct comprising a reporter
gene and a promoter region which interacts with either said first interaction domain or said
second interaction domain and wherein said first recombinant protein and said second
recombinant protein interact with sufficient affinity to facilitate expression of said reporter gene;
and

measuring said reporter gene expression level in the presence of a
modulating molecule.

24. The method of claim 23 wherein said first interaction domain is a DNA
binding domain and said second interaction domain is a transcriptional activation domain.

25. The method of claim 23 wherein said first interaction domain is a
transcriptional activation domain and said second interaction domain is a DNA binding domain.

26. The method of claim 23 wherein said reporter gene construct comprises a
nucleic acid molecule encoding β-galactosidase, green fluorescent protein, luciferase or
β-lactamase.

27. A method for modulating expression of a nuclear receptor-dependent gene 
in a cell comprising: 

expressing in said cell a first nucleic acid molecule encoding a protein 
arginine methyltransferase; 

expressing in said cell a second nucleic acid molecule encoding a p160
coactivator; and

wherein the co-expression of said first and said second nucleic acid
molecules modulates the expression level of said nuclear receptor-dependent gene.

28. The method of claim 27 wherein said protein arginine methyltransferase is
CARM1, PRMT1, PRMT2 or PRMT3.

29. The method of claim 28 further comprising expressing a second protein 
arginine methyltransferase. 

30. The method of claim 29 wherein said first protein arginine
methyltransferase is CARM1.

31. The method of claim 30 wherein said second protein arginine
methyltransferase is PRMT1, PRMT2 or PRMT3.

32. The method of claim 28 further comprising expressing a second
coactivator wherein said second coactivator possesses histone acetyltransferase activity.

33. The method of claim 32 wherein said second coactivator is CBP, P/CAF
or p300.

34. A method for screening of molecules that modulate CARM1 coactivator
activity in a cell comprising:

expressing in said cell a nuclear receptor-dependent reporter gene;

expressing in said cell CARM1;

expressing in said cell a p160 coactivator;

expressing in said cell a second coactivator with histone acetyltransferase activity
or a second protein arginine methyltransferase;

expressing in said cell a nuclear receptor gene, wherein said nuclear receptor gene
is expressed at a level such that expression of said reporter gene is at least about 10-fold higher
than in a cell not expressing either CARM1, a p160 coactivator, or either a second coactivator or
a second protein arginine methyltransferase; and

comparing the expression levels of said reporter gene in said cell in the presence
and absence of a modulating compound.

35. The method of claim 34 wherein said p160 coactivator is GRIP1, SRC-1
or p/CIP.

36. The method of claim 35 wherein said p160 coactivator is GRIP1.

37. The method of claim 35 wherein said second coactivator is p300, CBP or
P/CAF.

38. The method of claim 35 wherein said second protein arginine
methyltransferase is PRMT1, PRMT2 or PRMT3.
ABSTRACT



The invention relates to the cDNA and deduced amino acid sequence of the
Coactivator Associated arginine (R) Methyltransferase protein, CARM1. A method is described
for the use of CARM1 to regulate gene expression in vivo. CARM1 has also been used to
methylate arginine residues of histones, synthetic peptides, and other proteins. A method to use
CARM1 to screen for drugs that inhibit its methyltransferase activity is also described, as is a
method to screen for drugs that modulate CARM1's interactions with other proteins.

[Figure title: Synergy among three coactivators with different levels of ER]